
PRODUCTION OF HUMAN SERUM ALBUMIN IN 
METHYLOTROPHIC YEAST CELLS 



Field of the Invention

This invention relates to a process of
recombinant DNA technology for producing human serum
albumin (HSA) peptides in methylotrophic yeast such as
Pichia pastoris. Methylotrophic yeast transformants
containing in their genome at least one copy of a DNA
sequence operably encoding an HSA peptide under the
regulation of a promoter region of a gene of a
methylotrophic yeast and the S. cerevisiae alpha-mating
factor (AMF) pre-pro sequence are cultured under
conditions allowing the expression of HSA peptides into
the culture medium. The invention further relates to the
methylotrophic yeast transformants, DNA fragments and
expression vectors used for their production and cultures
containing same.

Background of the Invention 

Human serum albumin (HSA) is a naturally
occurring, single-chain polypeptide, and is the most
abundant protein found in the plasma of adult humans.
The concentration of albumin circulating throughout the
human body is about 40 mg/ml, corresponding to about 160
g of albumin for a 70 kg adult male. This protein aids
in the body's maintenance of osmotic pressure, and
functions in the binding and transport of a wide variety
of species, e.g., copper, nickel, calcium, bilirubin,
protoporphyrin, long chain fatty acids, prostaglandins,
steroid hormones, thyroxine, triiodothyronine, cystine,
and glutathione. One particularly large scale
application of albumin is the administration of albumin
to patients with circulatory failure or with albumin
depletion. For example, it is reported that in excess of
10,000 kilograms of purified albumin are administered
annually in the United States alone for such purpose.

Naturally occurring HSA contains 585 amino
acids. The amino acid sequence of HSA, as reported in
the literature, is as set forth in Sequence ID No. 1,
provided below.

The amino acid sequence of HSA has been
disclosed, for example, by Meloun et al., in FEBS Lett.
134-137 (1975); Lawn et al., in Nucleic Acids
Research 9: 6103-6114 (1981); and Dugaiczyk et al., in
Proc. Natl. Acad. Sci. USA 79: 71-75 (1982).

The molecule in natural form contains 17
disulfide linkages and arises from a precursor molecule
of about 609 amino acids containing an 18 amino acid
pre-peptide, a six amino acid pro-sequence, and the 585
amino acid mature molecule (see, e.g., Dugaiczyk et al.,
supra).

Since isolation of HSA from natural sources is
technically difficult, expensive, and time consuming,
recent efforts have centered on the development of
efficient recombinant methods for the production of HSA.

Of the hosts widely used for the production of
heterologous proteins, probably E. coli and Saccharomyces
cerevisiae (baker's yeast) are the best understood.
However, E. coli tends to produce HSA in an insoluble,
aggregate form. In addition, since E. coli produces
endotoxins, the recombinantly expressed product must be
subjected to extensive purification treatment to ensure
its safety for use in human subjects.

Yeasts can offer clear advantages over bacteria
in the production of heterologous proteins. Such
advantages include the ability of yeast to secrete
heterologous proteins into the culture medium. Secretion
of proteins from cells is generally superior to
production of proteins in the cytoplasm. Secreted
products are obtained in a higher degree of initial
purity and their further purification is easier to
accomplish without cellular debris. In the case of
sulfhydryl-rich proteins, there is another compelling
reason for the development of hosts capable of secreting
them into the culture medium: their correct tertiary
structure is produced and maintained via disulfide bonds.

The secretory pathway of the cell and the
extracellular medium are oxidizing environments which can
support disulfide bond formation [Smith et al., Science
229, 1219 (1985)]. In contrast, the cytoplasm is a
reducing environment in which disulfide bonds cannot
form. Upon cell breakage, too rapid formation of
disulfide linkages can result in random disulfide bond
formation. Consequently, production of sulfhydryl-rich
proteins, such as HSA, containing appropriately formed
disulfide bonds, can be best achieved by transit through
the secretory pathway.

Secretion of authentic biologically active
human serum albumin from S. cerevisiae is disclosed in
Bio/Technology 4: 726-730 (1986) by Etcheverry et al.
(employing native HSA signal sequence); Bio/Technology 8:
42-46 (1990) by Sleep et al. (employing five different
leader sequences: S. cerevisiae alpha mating factor
signal sequence, native HSA signal sequence,
Kluyveromyces lactis killer signal sequence, a fusion of
natural HSA/alpha mating factor sequences, and a fusion
of K. lactis killer/alpha mating factor sequences);
Nucleic Acids Res. 18: 1308 (1990) by Cousens et al.; and
Nucleic Acids Res. 18: 6075-6081 (1990) by Kalman et al.
(employing S. cerevisiae PHO5 signal sequence). In what
appear to be the best experiments from among all of
these references, HSA is produced in S. cerevisiae by
means of an expression cassette containing a DNA
sequence encoding mature HSA joined to sequences
encoding the Kluyveromyces lactis killer signal
sequence fused to the S. cerevisiae alpha mating
factor signal sequence, or the native HSA
signal sequence fused to the S. cerevisiae alpha mating
factor signal sequence. HSA was secreted into the shake
flask culture medium in a concentration up to only about
55 mg/l. In view of the problems usually encountered
with up-scaling the production of heterologous proteins
in autonomous plasmid-based yeast systems, such as S.
cerevisiae, there is no indication that HSA production in
S. cerevisiae could be at levels higher than those of the
above-described experimental system.

To overcome the major problems associated with
S. cerevisiae, e.g., loss of selection for plasmid
maintenance and problems concerning plasmid distribution,
copy number and stability in fermentors operated at high
cell density, a yeast expression system based on the
methylotrophic yeast Pichia pastoris has been developed.
A key feature making this system unique lies with the
promoter employed to drive heterologous gene expression.
This promoter, which is derived from the methanol-
regulated alcohol oxidase I (AOX1) gene of P. pastoris,
is highly expressed and tightly regulated (see, e.g., U.S.
Pat. No. 4,855,231, issued August 8, 1989). Another key
feature of the P. pastoris expression system is the
stable integration of expression cassettes into the P.
pastoris genome, thus significantly decreasing the chance
of vector loss (see U.S. Pat. No. 4,882,279, issued
November 21, 1989).

Cytoplasmic production of authentic
biologically active human serum albumin from P. pastoris
is disclosed in European Patent Application No.
89107459.3, published December 6, 1989 (No. 0 344 459).
The cited document contains very little detail as to the
level of production or the purity of HSA obtained. In
what appears to be the best experiment, HSA is produced
in P. pastoris by means of an expression cassette
containing a cloned cDNA sequence encoding mature HSA
under the control of the AOX1 promoter. HSA was obtained
from cells grown in shake flask culture (after cell
breakage and centrifugation to remove cell debris) at
concentrations up to only about 88,000 ng/ml. In view of
the problems encountered with the production of HSA in
the above-described P. pastoris yeast expression system,
there is no indication that HSA production in P. pastoris
could be at levels higher than those of the above-
described experimental system, or that HSA could be
secreted from P. pastoris.
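
The two prior-art titers quoted above are stated in different units (mg per liter for S. cerevisiae, ng per ml for P. pastoris). The following minimal Python sketch merely places them on a common scale; the function and variable names are illustrative assumptions and do not form part of the cited references.

```python
# Unit conversion only: put the two prior-art titers on a common scale (mg/l).
NG_PER_MG = 1_000_000  # 1 mg = 10^6 ng
ML_PER_L = 1_000       # 1 l  = 10^3 ml


def ng_per_ml_to_mg_per_l(value_ng_per_ml: float) -> float:
    """Convert a concentration from ng/ml to mg/l."""
    return value_ng_per_ml * ML_PER_L / NG_PER_MG


# Values as quoted in the text above.
s_cerevisiae_secreted_mg_per_l = 55.0
p_pastoris_cytoplasmic_mg_per_l = ng_per_ml_to_mg_per_l(88_000)

print(f"S. cerevisiae, secreted:  {s_cerevisiae_secreted_mg_per_l:.0f} mg/l")
print(f"P. pastoris, cytoplasmic: {p_pastoris_cytoplasmic_mg_per_l:.0f} mg/l")
```

On this common scale both prior-art figures fall within the same order of magnitude.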

Although P. pastoris has been used successfully
for the production of various heterologous proteins,
e.g., hepatitis B surface antigen [see Cregg et al.,
Bio/Technology 5, 479 (1987); and U.S. Pat. No.
4,895,800, issued January 23, 1990], lysozyme and
invertase [Digan et al., Developments in Industrial
Microbiology 29, 59 (1988); Tschopp et al.,
Bio/Technology 5, 1305 (1987)], endeavors to produce
other heterologous gene products in Pichia, especially by
secretion, have given mixed results. At our present
level of understanding of the P. pastoris expression
system, it is unpredictable whether a given gene can be
expressed to an appreciable level in this yeast or
whether Pichia will tolerate the presence of the
recombinant gene product in its cells. Further, it is
especially difficult to foresee if a particular protein
will be secreted by P. pastoris, and if it is, at what
efficiency. Even for S. cerevisiae, which has been
considerably more extensively studied than P. pastoris,
the mechanism of protein secretion is not well defined
and understood.



Summary of the Invention 

The present invention provides an expression
system suitable for the high level production of HSA. In
addition, the present invention provides a powerful
method for the production of secreted HSA peptides in
methylotrophic yeast such as Pichia pastoris, which
method can be easily scaled up from shake-flask
cultures to large fermentors with no loss in
productivity and without making major changes
in the fermentation conditions. The presently
preferred yeast species for use in the practice of the
present invention is Pichia pastoris, a known industrial
yeast strain that is capable of utilizing methanol as the
sole carbon and energy source (methylotroph).

We have surprisingly found that HSA peptides
can very efficiently be produced in, and secreted from,
methylotrophic yeast such as P. pastoris, by transforming
a methylotrophic yeast with, and preferably integrating
into the yeast genome, at least one copy of a first DNA
sequence operably encoding an HSA peptide, wherein said
first DNA sequence is operably associated with a second
DNA sequence encoding a secretion signal sequence
selected from the S. cerevisiae alpha-mating factor (AMF)
pre-pro sequence (including the proteolytic processing
site: lys-arg), or the HSA signal sequence, and wherein
both of said DNA sequences are under the regulation of a
methanol responsive promoter region of a gene of a
methylotrophic yeast. Methylotrophic yeast cells
containing in their genome at least one copy of these DNA
sequences efficiently produce biologically active HSA
peptides as a product secreted into the medium.

Accordingly, this invention relates to a
methylotrophic yeast cell containing in its genome at
least one copy of a DNA sequence operably encoding an HSA
peptide, operably associated with a DNA sequence encoding
a secretion signal sequence selected from the S.
cerevisiae AMF pre-pro sequence (including the
proteolytic processing site: lys-arg), or the HSA signal
sequence, wherein both the coding sequence and signal
sequence are maintained under the regulation of a
promoter region of a gene of a methylotrophic yeast.

According to another aspect, this invention
relates to a DNA fragment containing at least one copy of
an expression cassette comprising in the reading frame
direction of transcription, the following DNA sequences:

(i) a promoter region of a methanol responsive
gene of a methylotrophic yeast,

(ii) a DNA sequence encoding a polypeptide
consisting of:

    (a) a secretion signal sequence selected
    from:

        (1) the S. cerevisiae AMF pre-pro
        sequence, including the
        proteolytic processing site:
        lys-arg, or

        (2) the native HSA secretion signal
        sequence, and

    (b) a DNA sequence encoding an HSA
    peptide; and

(iii) a transcription terminator functional in a
methylotrophic yeast,

wherein said DNA sequences are
operationally associated with one another for
transcription of the sequences encoding said
polypeptide.
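
By way of illustration only, the ordered elements (i) to (iii) of the expression cassette can be modeled as a simple data structure. The following Python sketch is a hypothetical aid to understanding: the class name, the method names and all sequences shown are placeholders assumed for the example, not sequences of the invention. It shows one necessary condition for the signal sequence and the HSA coding sequence to be read as a single polypeptide, namely that the signal sequence comprises a whole number of codons.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExpressionCassette:
    """Ordered cassette elements, listed in the direction of transcription."""
    promoter: str         # (i) methanol responsive promoter region
    signal_sequence: str  # (ii)(a) DNA encoding the secretion signal
    hsa_coding: str       # (ii)(b) DNA encoding the HSA peptide
    terminator: str       # (iii) transcription terminator

    def coding_region(self) -> str:
        """Signal sequence fused directly to the HSA coding sequence."""
        return self.signal_sequence + self.hsa_coding

    def is_in_frame(self) -> bool:
        """True when the signal sequence is a whole number of codons, so the
        HSA codons that follow are read in the same reading frame."""
        return len(self.signal_sequence) % 3 == 0

    def assemble(self) -> str:
        """Concatenate the elements in cassette order."""
        return self.promoter + self.coding_region() + self.terminator


# Placeholder sequences only; none of these is a real promoter, signal,
# HSA or terminator sequence.
cassette = ExpressionCassette(
    promoter="ccccgggg",
    signal_sequence="ATGAAGTGG",
    hsa_coding="GATGCTCACTAA",
    terminator="ttttaaaa",
)
print(cassette.is_in_frame())
print(cassette.assemble())
```

A signal sequence that is one nucleotide too short or too long would shift the reading frame of every HSA codon after it, which is why the check is made on the signal sequence rather than on the cassette as a whole.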

The DNA fragment according to the invention can
be transformed into the methylotrophic yeast cells as a
linear fragment flanked by DNA sequences having
sufficient homology with a target gene to effect
integration of said DNA fragment therein. In this case
integration takes place by replacement at the site of the
target gene. Alternatively, the DNA fragment can be part
of a circular plasmid, which may be linearized to
facilitate integration, and will integrate by addition at
a site of homology between the host and the plasmid
sequence.



WO 92/13951    PCT/US92/01015

The invention further concerns an expression 
vector containing at least one copy of an expression 
cassette as described hereinabove. 

According to a still further embodiment, the
invention relates to a process for producing HSA peptides
by growing methylotrophic yeast transformants containing
in their genome at least one copy of a DNA sequence
operably encoding an HSA peptide, operably associated
with DNA encoding a secretion signal sequence selected
from the S. cerevisiae AMF pre-pro sequence, or the HSA
signal sequence, wherein both the coding sequence and the
signal sequence are maintained under the regulation of a
promoter region of a methanol responsive gene of a
methylotrophic yeast, under conditions allowing the
expression of said DNA sequence in said transformants and
secreting mature HSA peptides into the culture medium.
Cultures of viable methylotrophic yeast cells capable of
producing HSA peptides are also within the scope of the
invention.

The polypeptide product is secreted into the
culture medium at surprisingly high concentrations; the
level of secretion of HSA peptides is at least an order
of magnitude higher than the best results published in
the literature. These present, excellent results are
due, in part, to the surprising fact that the S.
cerevisiae AMF pre-pro sequence and the HSA signal
sequence function unexpectedly well to direct secretion
of HSA peptides in methylotrophic yeast such as P.
pastoris.

The present invention is directed to the above
aspects and all associated methods and means for
accomplishing such. For example, the invention includes
the technology requisite to suitable growth of the
methylotrophic yeast host cells, fermentation, and
isolation and purification of the HSA gene product.

P. pastoris is described herein as a model
system for the use of methylotrophic yeast hosts. Other
useful methylotrophic yeasts can be taken from four
genera, namely Candida, Hansenula, Pichia and Torulopsis.
Equivalent species from them may be used as hosts herein
primarily based upon their demonstrated characterization
of being supportable for growth and exploitation on
methanol as a single carbon nutriment source. See, for
example, Gleeson et al., Yeast 4, 1 (1988).

Brief Description of the Drawings

Figure 1 is a restriction map of plasmid pAO815.

Figure 2 is a restriction map of plasmid pAO856.

Figure 3 is a restriction map of plasmid pHSA111.

Figure 4 is a restriction map of plasmid pHSA211.

Figure 5 is a restriction map of plasmid pHSA212.

Figure 6 is a restriction map of plasmid pHSA214.

Figure 7 is a restriction map of plasmid pHSA216.

In each of the restriction maps provided
herein, restriction sites employed for manipulation of
DNA fragments, but which are destroyed upon ligation, are
indicated by enclosing the notation for the destroyed
sites in parentheses.

For the multi-copy vectors pHSA212, pHSA214 and
pHSA216, restriction sites within the "aMF-HSA" construct
are not indicated, but are the same as shown in Figure 4
for vector pHSA211. Similarly, the non-HSA encoding
BamHI-ClaI fragment in each of vectors pHSA212, pHSA214
and pHSA216 has the same restriction sites as shown in
Figure 4 for the corresponding BamHI-ClaI fragment of
vector pHSA211.

Detailed Description of the Invention

The term "human serum albumin" or "HSA peptide"
or simply "HSA", as used throughout the specification and
in the claims, refers to a polypeptide product which
exhibits similar, in-kind, biological activities to
natural human serum albumin (HSA), as measured in
recognized bioassays, and has substantially the same
amino acid sequence as HSA. It will be understood that
polypeptides deficient in one or more amino acids in the
amino acid sequence reported in the literature for
naturally occurring HSA, or polypeptides containing
additional amino acids, or polypeptides in which one or
more amino acids in the amino acid sequence of natural
HSA are replaced by other amino acids are within the
scope of the invention, provided that they exhibit the
functional activity of HSA, e.g., aid in the body's
maintenance of osmotic pressure, and function in the
binding and transport of a wide variety of species, e.g.,
copper, nickel, calcium, bilirubin, protoporphyrin, long
chain fatty acids, prostaglandins, steroid hormones,
thyroxine, triiodothyronine, cystine, and glutathione.

The invention is intended to embrace all the
allelic variations of HSA. Moreover, derivatives
obtained by simple modification of the amino acid
sequence of the naturally occurring product, e.g., by way
of site-directed mutagenesis or other standard
procedures, are included within the scope of the present
invention. HSA forms produced by proteolysis of host
cells that exhibit similar biological activities to
mature, naturally occurring HSA are also encompassed by
the present invention.

The amino acids, which occur in the various
amino acid sequences referred to in the specification
have their usual, three- and one-letter abbreviations
routinely used in the art, i.e.:

Amino Acid           Abbreviation

L-Alanine            Ala    A
L-Arginine           Arg    R
L-Asparagine         Asn    N
L-Aspartic acid      Asp    D
L-Cysteine           Cys    C
L-Glutamine          Gln    Q
L-Glutamic acid      Glu    E
L-Glycine            Gly    G
L-Histidine          His    H
L-Isoleucine         Ile    I
L-Leucine            Leu    L
L-Lysine             Lys    K
L-Methionine         Met    M
L-Phenylalanine      Phe    F
L-Proline            Pro    P
L-Serine             Ser    S
L-Threonine          Thr    T
L-Tryptophan         Trp    W
L-Tyrosine           Tyr    Y
L-Valine             Val    V
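
For convenience, the abbreviation table above may be expressed as a lookup for converting between the two notations. The following Python sketch is illustrative only; the dictionary and function names are assumptions made for the example. It converts a hyphen-separated three-letter sequence, such as the "lys-arg" processing site referred to in this specification, into one-letter code.

```python
# Three-letter to one-letter amino acid abbreviations, as tabulated above.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter_sequence: str, sep: str = "-") -> str:
    """Convert e.g. 'lys-arg' to one-letter code; case is ignored.

    Raises KeyError for an abbreviation that is not in the table.
    """
    residues = three_letter_sequence.split(sep)
    return "".join(THREE_TO_ONE[r.strip().capitalize()] for r in residues)


print(to_one_letter("lys-arg"))
```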



According to the invention, HSA peptides are
produced by methylotrophic yeast cells containing in
their genome at least one copy of a DNA sequence operably
encoding HSA peptides operably associated with DNA
encoding a secretion signal sequence selected from (1)
the S. cerevisiae AMF pre-pro sequence (including the
proteolytic processing site: lys-arg), or (2) the native
HSA secretion signal sequence, both under the regulation
of a promoter region of a methanol responsive gene of a
methylotrophic yeast.

The term "a DNA sequence operably encoding HSA
peptides" as used herein includes DNA sequences encoding
the 585 amino acid form of HSA or any other "HSA peptide"
as defined hereinabove. DNA sequences encoding HSA are
known in the art. They may be obtained by chemical
synthesis, or by reverse transcription of messenger RNA
(mRNA) corresponding to HSA into complementary DNA (cDNA)
and converting the latter into a double stranded cDNA. The
mRNA can be isolated, for example, from human liver (see
Dugaiczyk et al., supra). Chemical synthesis of a
gene for HSA is, for example, disclosed by Kalman et al.,
supra. The requisite DNA sequence can also be removed,
for example, by restriction enzyme digest of known
vectors harboring the HSA gene. Examples of such vectors
and the means for their preparation can be taken from the
following publications: Etcheverry et al., supra; Sleep
et al., supra; Cousens et al., supra; and Kalman et al.,
supra, etc. The structure of a preferred HSA gene used
in accordance with the present invention is further
elucidated in the examples.

The presently preferred promoter region
employed to drive the HSA gene expression is derived from
a methanol-regulated alcohol oxidase gene of P. pastoris.
P. pastoris is known to contain two functional alcohol
oxidase genes: alcohol oxidase I (AOX1) and alcohol
oxidase II (AOX2) genes. The coding portions of the two
AOX genes are closely homologous at both the DNA and the
predicted amino acid sequence levels and share common
restriction sites. The proteins expressed from the two
genes have similar enzymatic properties, but the promoter
of the AOX1 gene is more efficient and highly expressed.
Therefore, the use of the AOX1 gene is preferred for HSA
expression. The AOX1 gene, including its promoter, has
been isolated and thoroughly characterized [Ellis et al.,
Mol. Cell. Biol. 5, 1111 (1985)].

The expression cassette used for transforming
methylotrophic yeast cells contains, in addition to a
methanol responsive promoter of a methylotrophic yeast
gene and the HSA encoding DNA sequence (HSA gene), a DNA
sequence encoding a secretion signal sequence selected
from the in-reading frame S. cerevisiae AMF pre-pro
sequence, including a DNA sequence encoding the
processing site: lys-arg (also referred to as the lys-
arg encoding sequence), or the native HSA secretion
signal sequence; and a transcription terminator
functional in a methylotrophic yeast.

The S. cerevisiae alpha-mating factor is a 13-
residue peptide, secreted by cells of the "alpha" mating
type, that acts on cells of the opposite "a" mating type
to promote efficient conjugation between the two cell
types and thereby formation of "a-alpha" diploid cells
[Thorner et al., The Molecular Biology of the Yeast
Saccharomyces, Cold Spring Harbor Laboratory, Cold Spring
Harbor, NY, 143 (1981)]. The AMF pre-pro sequence is a
leader sequence contained in the AMF precursor molecule,
and includes the lys-arg encoding sequence which is
necessary for proteolytic processing and secretion (see,
e.g., Brake et al., supra). The AMF pre-pro sequence,
including the lys-arg encoding sequence, is a 255 bp
fragment which is set forth as Sequence ID No. 2.

The native HSA secretion signal sequence is a
24 amino acid fragment, comprising an 18 amino acid pre-
peptide, and a six amino acid pro-sequence. These
sequences have been described by Dugaiczyk et al.,
supra. The nucleotide sequence employed in the practice
of the present invention encodes the native HSA secretion
signal sequence, employing yeast-preferred codons. See
Sequence ID No. 3 for the specific sequence employed
herein. Those of skill in the art recognize that
numerous other nucleotide sequences will also encode the
desired 24 amino acid pre-pro sequence. Such sequences
could readily be used instead of the specific sequence
set forth in Sequence ID No. 3.
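
The statement that numerous nucleotide sequences encode the same 24 amino acid pre-pro sequence follows from the degeneracy of the standard genetic code. The following Python sketch counts the synonymous encodings of a peptide by multiplying the number of codons available for each residue. It is illustrative only: the function name is an assumption, and the pre-pro amino acid sequence shown is taken from public sequence databases rather than from Sequence ID No. 3, which is not reproduced in this section.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code
# (61 sense codons in total; stop codons are excluded).
CODONS_PER_AMINO_ACID = {
    "M": 1, "W": 1,
    "C": 2, "D": 2, "E": 2, "F": 2, "H": 2, "K": 2, "N": 2, "Q": 2, "Y": 2,
    "I": 3,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "L": 6, "R": 6, "S": 6,
}


def count_encodings(peptide: str) -> int:
    """Number of distinct DNA sequences encoding `peptide` (one-letter code)."""
    return prod(CODONS_PER_AMINO_ACID[aa] for aa in peptide.upper())


# Assumption: native HSA pre-peptide (18 residues) and pro-sequence
# (6 residues) as listed in public databases.
HSA_PRE = "MKWVTFISLLFLFSSAYS"
HSA_PRO = "RGVFRR"

signal = HSA_PRE + HSA_PRO
print(len(signal), "residues,", len(signal) * 3, "nucleotides")
print(count_encodings(signal), "synonymous coding sequences")
```

The choice among these synonymous sequences is what permits the use of yeast-preferred codons without altering the encoded signal sequence.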

The transcription terminator functional in a
methylotrophic yeast used in accordance with the present
invention has a subsegment which encodes a polyadenylation
signal and polyadenylation site in the transcript and/or a
subsegment which provides a transcription termination
signal for transcription from the promoter used in the
expression cassette according to the invention (the term
"expression cassette" as used herein and throughout the
specification and claims refers to a DNA sequence which
includes sequences functional for the expression and
secretion process). The entire transcription terminator
is taken from a protein-encoding gene, which may be the
same or different from the gene which is the source of
the promoter used according to the invention.

For the practice of the present invention it is
preferred that multiple copies of the above-described
expression cassettes be contained on one DNA fragment,
preferably in a head-to-tail orientation. It is
particularly preferred that four or more copies of the
above-described expression cassette be contained on one
DNA fragment.
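
The head-to-tail orientation referred to above can be illustrated as a simple tandem repeat in which every copy of the cassette reads in the same direction. The following Python sketch is a hypothetical illustration; the function names and the four-nucleotide placeholder standing in for a cassette are assumptions made for the example. An inverted second copy is shown for contrast only.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def head_to_tail(cassette: str, copies: int) -> str:
    """Tandem array in which every copy has the same orientation."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return cassette * copies


def with_inverted_second_copy(cassette: str) -> str:
    """For contrast: the second copy is inverted relative to the first."""
    return cassette + reverse_complement(cassette)


placeholder = "AAGC"  # stands in for a complete expression cassette
print(head_to_tail(placeholder, 4))
print(with_inverted_second_copy(placeholder))
```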

The DNA fragments according to the invention
optionally further comprise a selectable marker gene.
For this purpose, any selectable marker gene functional
in methylotrophic yeast such as P. pastoris may be
employed, i.e., any gene which confers a phenotype upon
methylotrophic yeast cells, thereby allowing them to be
identified and selectively grown from among a vast
majority of untransformed cells. Suitable selectable
marker genes include, for example, selectable marker
systems composed of an auxotrophic mutant P. pastoris
host strain and a wild type biosynthetic gene which
complements the host's defect. For transformation of
his4- P. pastoris strains, for example, the S. cerevisiae
or P. pastoris HIS4 gene, or for transformation of arg4-
mutants, the S. cerevisiae ARG4 gene or the P. pastoris
ARG4 gene, may be employed.

If the yeast host is transformed with a linear
DNA fragment containing the HSA gene under the regulation
of a promoter region of a P. pastoris gene and sequences
necessary for processing and secretion, the expression
cassette is integrated into the host genome by any of the
gene addition or replacement techniques known in the art,
such as by one-step gene replacement [see, e.g.,
Rothstein, Methods Enzymol. 101, 202 (1983); Cregg et
al., Bio/Technology 5, 479 (1987)] or by two-step gene
replacement methods [see, e.g., Scherer and Davis, Proc.
Natl. Acad. Sci. USA 76, 4951 (1979)]. In the gene
replacement technique, the linear DNA fragment is
directed to the desired locus in the genome of the host
organism, i.e., to the target gene to be disrupted, by
means of flanking DNA sequences having sufficient
homology with the target gene to effect integration of
the DNA fragment therein. One-step gene disruptions are
usually successful if the DNA to be introduced has as
little as 0.2 kb homology with the fragment locus of the
target gene; it is, however, preferable to maximize the
degree of homology for efficiency.

If the DNA fragment according to the invention
is contained within or is an expression vector, e.g., a
circular plasmid, one or more copies of the plasmid can
be integrated at the same or different loci, by addition
to the genome, instead of by gene disruption.
Linearization of the plasmid by means of a suitable
restriction endonuclease facilitates such integration.

The term "expression vector" includes vectors
capable of expressing DNA sequences contained therein,
where such sequences are in operational association with
other sequences capable of effecting their expression,
i.e., promoter sequences. In general, expression vectors
usually used in recombinant DNA technology are often in
the form of "plasmids", i.e., circular, double-stranded
DNA loops which, in their vector form, are not bound to
the chromosome. In the present specification the terms
"vector" and "plasmid" are used interchangeably.
However, the invention is intended to include other forms
of expression vectors as well, which function
equivalently.

In the DNA fragment according to the invention
the segments of the expression cassette are
"operationally associated" with one another. The DNA
sequence encoding HSA peptides is positioned and oriented
functionally with respect to the promoter, the DNA
sequence encoding the secretion signal sequence, i.e.,
the S. cerevisiae AMF pre-pro sequence (including the DNA
sequence encoding the AMF processing site: lys-arg) or
the HSA signal sequence, and the transcription
terminator. Thus, the polypeptide encoding segment is
transcribed, under regulation of the promoter region,
into a transcript capable of providing, upon translation,
the desired polypeptide. Because of the presence of the
AMF or HSA signal sequence, the expressed HSA product is
found as a secreted entity in the culture medium.
Appropriate reading frame positioning and orientation of
the various segments of the expression cassette are
within the knowledge of persons of ordinary skill in the
art; further details are given in the Examples.

The DNA fragment provided by the present
invention may include sequences allowing for its
replication and selection in bacteria, especially E.
coli. In this way, large quantities of the DNA fragment
can be produced by replication in bacteria.

Methods of transforming methylotrophic yeast
(such as Pichia pastoris) as well as methods applicable
for culturing methylotrophic yeast cells containing in
their genome a gene for a heterologous protein are known
generally in the art.

According to the invention, the expression
cassettes are transformed into the cells of a
methylotrophic yeast by the spheroplast technique,
described by Cregg et al., Mol. Cell. Biol. 5, 3376
(1985). See also U.S. Pat. No. 4,879,231, issued
November 7, 1989. Alternatively, the expression
cassettes can be transformed into the cells of a
methylotrophic yeast by the whole-cell lithium
chloride yeast transformation system [Ito et al., Agric.
Biol. Chem. 48, 341 (1984)], with modification necessary
for adaptation to P. pastoris [see U.S. Pat. No.
4,929,555, issued May 29, 1990]. For the purpose of the
present invention the spheroplast method is preferred,
primarily since it yields a greater number of
transformants.

Positive transformants are characterized by
Southern blot analysis [Maniatis et al., Molecular
Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, New York, USA
(1982)] for the site of DNA integration, Northern blots
[Maniatis, op. cit.; R.S. Zitomer and B.D. Hall, J. Biol.
Chem. 251, 6320 (1976)] for methanol-responsive HSA gene
expression, and product analysis for the presence of
secreted HSA peptides in the growth media.

Transformed strains, which are of the desired
phenotype and genotype, are grown in fermentors. For the
large-scale production of recombinant DNA-based products
in methylotrophic yeast, a three-stage, high cell-
density, batch fermentation system is normally employed.
In the first, or growth stage, expression hosts are
cultured in defined minimal medium with excess glycerol
as carbon source. When cells are grown on this carbon
source, heterologous gene expression is completely
repressed, which allows the generation of cell mass in
the absence of heterologous protein expression. Next, a
short period of glycerol-limited growth is allowed.
Subsequent to the glycerol-limited growth, methanol is
added, initiating the expression of the desired
heterologous protein. This third stage is the so-called
production stage.

The term "culture" means a propagation of cells
in a medium conducive to their growth, and all sub-
cultures thereof. The term "subculture" refers to a
culture of cells grown from cells of another culture
(source culture), or any subculture of the source
culture, regardless of the number of subculturings which
have been performed between the subculture of interest
and the source culture.

According to a preferred embodiment of the
invention, the heterologous protein expression system
used for HSA production utilizes the promoter derived
from the methanol-regulated AOX1 gene of P. pastoris,
which is very efficiently expressed and tightly
regulated. This gene can be the source of the
transcription terminator as well. The presently
preferred expression cassette comprises, operationally
associated with one another, a P. pastoris AOX1 promoter,
DNA encoding a secretion signal sequence selected from
the S. cerevisiae AMF pre-pro sequence (including the DNA
sequence encoding the AMF processing site: lys-arg) or
the native HSA signal sequence, a DNA sequence encoding
mature HSA, and a transcription terminator derived from
the P. pastoris AOX1 gene. Preferably, two or more of
such expression cassettes are contained on one DNA
fragment, in head-to-tail orientation, to yield multiple
expression cassettes on a single contiguous DNA fragment.

The presently preferred host cells to be
transformed with multiple expression cassettes are P.
pastoris cells having at least one mutation that can be
complemented with a marker gene present on a transforming
DNA fragment. Preferably his4- (GS115) or arg4- (GS190)
auxotrophic mutant P. pastoris strains are employed.

The fragment containing multiple expression
cassettes is inserted into a plasmid containing a marker
gene complementing the host's defect. pBR322-based
plasmids, e.g., pAO856, are preferred. Insertion of one
or multiple copies of the HSA expression/secretion
cassette into parent plasmid pAO856 produces plasmids
pHSA111, pHSA211, pHSA212, pHSA214 and pHSA216.

To develop Mut- expression strains of P.
pastoris (Mut refers to the methanol-utilization
phenotype), the transforming DNA comprising the
expression cassette(s) is preferably integrated into the
host genome by a one-step gene replacement technique.
The expression vector is digested with an appropriate
enzyme to yield a linear DNA fragment with ends
homologous to the AOX1 locus by means of the flanking
homologous sequences. This approach avoids the problems
encountered with S. cerevisiae, wherein expression
cassettes must be present on multicopy plasmids to
achieve a high level of expression. As a result of gene
replacement, Mut- strains are obtained. In Mut- strains,
the AOX1 gene is replaced with the expression
cassette(s), thus decreasing the strain's ability to
utilize methanol. A slow growth rate on methanol is
maintained by expression of the AOX2 gene product. The
transformants in which the expression cassette has
integrated into the AOX1 locus by site-directed
recombination can be identified by first screening for
the presence of the complementing gene. This is
preferably accomplished by growing the cells in a medium
lacking the complementing gene product and identifying
those cells which are able to grow by virtue of
expression of the complementing gene. Next, the selected
cells are screened for their Mut phenotype by growing
them in the presence of methanol and monitoring their
growth rate.

To develop Mut+ HSA-expressing strains, the
fragment comprising one or more expression cassette(s)
preferably is integrated into the host genome by
transformation of the host with a circular plasmid
comprising the expression cassette(s) (wherein the
plasmid has been linearized by cutting at a unique
restriction site to enhance integration). The
integration occurs by addition at a locus or loci having
homology with one or more sequences present on the
transformation vector.

Positive transformants are characterized by
Southern analysis for the site of DNA integration, by
Northern analysis for methanol-responsive HSA gene
expression, and by product analysis for the presence of
secreted HSA peptides in the growth media.
Methylotrophic yeast strains which have integrated one or
multiple copies of the expression cassettes at a desired
site can be identified by Southern blot analysis.
Strains which demonstrate enhanced secretion of HSA may
be identified by Northern or product analysis; however,
this characteristic is not always easy to detect in
shake-flask experiments.

Methylotrophic yeast transformants which are
identified to have the desired genotype and phenotype are
grown in fermentors. Typically a three-step production
process is used. Initially, cells are grown on a
repressing carbon source, preferably excess glycerol. In
this stage the cell mass is generated in the absence of
HSA expression. Next, a short period of glycerol-limited
growth is allowed. After exhaustion of
glycerol, methanol alone (methanol excess fed-batch mode)
or limiting glycerol and methanol (mixed-feed fed-batch
mode) are added to the fermentor, resulting in the
expression of the HSA gene driven by a methanol-responsive
promoter. The level of HSA secreted into the
media can be determined in a variety of ways, e.g., by
Western blot analysis of the media in parallel with an
HSA standard, using anti-HSA antisera, or by HPLC after
suitable pretreatment of the medium.
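
The three-step process described above can be summarized as a small
data sketch. This is only an illustration of the stage order and the
two feeding modes named in the text; the class, function and mode
names are hypothetical, and no feed rates or durations are implied
because the text gives none.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    feed: tuple             # carbon sources supplied during the stage
    methanol_induced: bool   # True once methanol has been added


def production_schedule(mode: str) -> list:
    """Return the three stages for 'methanol_excess' or 'mixed_feed' (hypothetical labels)."""
    if mode == "methanol_excess":
        induction_feed = ("methanol",)
    elif mode == "mixed_feed":
        induction_feed = ("limiting glycerol", "methanol")
    else:
        raise ValueError(f"unknown fed-batch mode: {mode!r}")
    return [
        Stage("growth", ("excess glycerol",), False),
        Stage("glycerol-limited growth", ("limiting glycerol",), False),
        Stage("production", induction_feed, True),
    ]


if __name__ == "__main__":
    for stage in production_schedule("mixed_feed"):
        print(stage)
```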

The invention is further illustrated by the
following non-limiting examples.

Examples

Example 1

The expression vector constructions disclosed
in the present application were performed using standard
procedures, as described, for example, in Maniatis et al.,
Molecular Cloning: A Laboratory Manual, Cold Spring
Harbor Laboratory Press, Cold Spring Harbor, New York,
USA (1982) and Davis et al., Basic Methods in Molecular
Biology, Elsevier Science Publishing, Inc., New York
(1986).

All fragments described in this Example were
isolated on 0.8% agarose gels. All vectors were digested
with the appropriate enzyme and then treated with calf
intestinal alkaline phosphatase. In each case, approximately
10-20 ng of vector were ligated with approximately 100-200 ng
of insert. Cells of E. coli strain MC1061 were used as
host for transformation of non-M13 vectors; cells of E.
coli strain JM103 were used as host for M13 vectors.

The HSA gene was obtained from a pBR322-based
plasmid on an 1800 bp HindIII-EcoRI fragment. The HSA
encoding fragment employed is set forth in Sequence ID
No. 1.

The AMF pre-pro encoding sequence (including
the proteolytic processing site: lys-arg) employed in the
present study was a 255 nucleotide fragment set forth in
Sequence ID No. 2.
This 255 nucleotide fragment was derived from
plasmid pAO203. The construction of plasmid pAO203 is
described in detail in Example 1f below.

a. Construction of plasmid pHSA211, a one
copy vector employing the AMF signal
sequence:

Plasmid pHSA211 was constructed as follows:

The synthetic structural gene encoding HSA
employed herein, including a codon for methionine at the
5' end, was obtained as an approximately 1800-bp HindIII-
EcoRI fragment in plasmid pMET-HSA [Sequence ID No. 1;
see also Kalman et al., Nucleic Acids Res. 18:6075
(1990)]. The synthetic gene was prepared with Pichia-
preferred codons. Plasmid pMET-HSA was used to transform
E. coli strain MC1061. Ampicillin-resistant colonies
were selected, and DNA from six colonies was analyzed by
digestion with HindIII and EcoRI. All six transformants
displayed the expected approximately 1800-bp band. One
colony was selected for a large-scale plasmid
preparation. Subsequently, the entire HindIII-EcoRI
insert was sequenced, and was found to agree with that of
the original DNA.

The ~1800 bp HindIII-EcoRI fragment from pMET-
HSA was isolated and inserted into M13mp10, creating
vector pHSA101. Site-directed mutagenesis [Zoller, M.J.
and Smith, M. in Methods in Enzymology 100:468 (Wu, R.,
Grossman, L., and Moldave, K., eds.) Academic Press, New
York (1983)] was used to insert EcoRI and BamHI sites at
the 3' end of the HSA gene, immediately following the
translation termination codon. The mutagenizing and
screening oligonucleotides were of the following
sequence:

mutagenizing oligo #1 (SEQ ID No. 4):
TGCTTTGGGTTTGTAAGAATTCGGATCCCGTAATCATGGTCAT

screening oligo #1 (SEQ ID No. 5):
TTGTAAGAATTCGGATCCCGTAAT

Single-stranded vector was prepared from positive plaques
and used to transform JM103 cells. The resultant plaques
were screened again and single positives were sequenced.
One mutagenized clone, pHSA102, was selected and
sequenced to verify the addition of EcoRI and BamHI
restriction sites.
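
The placement of the new sites can be checked directly from the
oligonucleotide sequences given above. The following is a minimal
sketch, not part of the described procedure: it only searches the
mutagenizing oligo for the EcoRI (GAATTC) and BamHI (GGATCC)
recognition sequences and for the TAA stop codon. The helper name
site_positions is hypothetical.

```python
# Sequences copied from SEQ ID No. 4 and SEQ ID No. 5 as printed in the text.
MUT_OLIGO_1 = "TGCTTTGGGTTTGTAAGAATTCGGATCCCGTAATCATGGTCAT"
SCREEN_OLIGO_1 = "TTGTAAGAATTCGGATCCCGTAAT"

# Recognition sequences of the two enzymes named in the text.
SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC"}


def site_positions(seq, sites):
    """Return the 0-based index of the first occurrence of each site (-1 if absent)."""
    return {name: seq.find(motif) for name, motif in sites.items()}


positions = site_positions(MUT_OLIGO_1, SITES)
stop_index = MUT_OLIGO_1.find("TAA")  # first TAA in the oligo

if __name__ == "__main__":
    print("site positions:", positions)
    print("stop codon ends where EcoRI begins:",
          stop_index + 3 == positions["EcoRI"])
    print("screening oligo lies within mutagenizing oligo:",
          SCREEN_OLIGO_1 in MUT_OLIGO_1)
```

In this reading the EcoRI site starts at the base directly after TAA
and the BamHI site follows immediately, which agrees with the
statement that the sites sit right after the termination codon.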

Plasmid pHSA102 was then used for a second
site-directed mutagenesis to remove the StuI site within
the gene. Although this StuI site is methylated in DNA
prepared from E. coli strain MC1061, and as a result
cannot be cleaved by the enzyme, deletion of this site
was performed so that unmethylated DNA from yeast
transformants generated with this gene could easily be
analyzed by Southern hybridization. Removal of the site
was accomplished without alteration of the amino acid
sequence; however, a HindIII site was created by the
change. The mutagenizing and screening oligonucleotides
were of the following sequence:

mutagenizing oligo #2 (SEQ ID No. 6):
TGAAAGAGCCTTCAAAGCTTGGGCTGTTGCTAGATT

screening oligo #2 (SEQ ID No. 7):
CTTCAAAGCTTGGGCTG

The resulting mutagenized clone, pHSA103, was sequenced
to verify the change.

The approximately 1800-bp HindIII-BamHI
fragment from pHSA103 was isolated and inserted into
HindIII-BamHI-digested plasmid pAO203. Plasmid pAO203 is
comprised of the DNA sequence encoding the αMF pre-pro
region (with an EcoRI site at the 5' end), followed by
nucleotides which encode the amino acids for processing
sites, lys-arg and (glu-ala)2, which in turn are followed
by a HindIII site at the 3' end. Construction of pAO203
is described in Example 1f.

The resulting plasmid, pHSA201, contains, on an
EcoRI fragment, DNA encoding the 83-amino acid leader
sequence of the αMF prepro region followed by the
processing sites, lys-arg and (glu-ala)2, joined to DNA
encoding the HSA gene. The approximately 2050-bp EcoRI
fragment was isolated from pHSA201 and cloned into
plasmid M13mp18. Transformants were screened with XbaI
for the presence of an ~1500-bp band, which indicates
that the EcoRI insert is oriented such that the single-
stranded template will contain the antisense strand. The
resulting vector, pHSA202, was subjected to site-directed
mutagenesis in order to delete the DNA for the (glu-ala)2
processing sites, the polylinker from the original pMET-
HSA plasmid, and the codon for methionine at the 5' end
of the gene. The mutagenizing and screening oligos were
of the following sequences:

mutagenizing oligo #3 (SEQ ID No. 8):
GTATCTTTGGATAAAAGAGACGCTCACAAGTCTGAAGT

screening oligo #3 (SEQ ID No. 9):
GATAAAAGAGACGCTCAC



The mutagenized plasmid, pHSA203, was sequenced to verify
that the gene for HSA was fused directly to the αMF
prepro region and lys-arg processing site.
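
The intended junction can be read off mutagenizing oligo #3 by
translating it. The sketch below is illustrative only. It assumes the
standard genetic code and, as a further assumption, that the reading
frame starts at the first base of the oligo as printed; the helper
name translate is hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(seq):
    """Translate complete codons from the first base; trailing bases are ignored."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# Sequence copied from SEQ ID No. 8 as printed in the text.
MUT_OLIGO_3 = "GTATCTTTGGATAAAAGAGACGCTCACAAGTCTGAAGT"
peptide = translate(MUT_OLIGO_3)

if __name__ == "__main__":
    print(peptide)
```

Under these assumptions the oligo encodes lys-arg (KR) followed
directly by asp-ala-his-lys, with no glu-ala repeat and no methionine
codon in between, which is the fusion the sequencing step is said to
confirm.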

The DNA encoding the prepro αMF-HSA fusion gene
was isolated on an approximately 2000-bp EcoRI fragment
from pHSA203 and inserted into the unique EcoRI site of
Pichia pastoris vector pAO856, described in Example 1j.
The ligation was transformed into MC1061 cells and ampR
colonies were selected. Positives were identified by the
presence of approximately 3450 and 6500 bp bands upon
digestion with SalI. The resulting expression vector,
pHSA211 (see Figure 4), comprises one copy of an
expression cassette encoding the pre-pro αMF-HSA fusion
gene under the control of the Pichia pastoris AOX1
promoter and regulatory regions, as well as the AOX1
transcription termination and polyadenylation signals.
In addition, the vector includes the Pichia pastoris HIS4
gene used for selection in His- hosts and additional 3'
AOX1 sequences that can be used to direct integration
into the host genome.

The DNA for the entire HSA insert as well as 
approximately 65 nucleotides each of the promoter and 
termination regions of pHSA211 were sequenced to verify 
that no changes had occurred during cloning.

b. Construction of plasmid pHSA111, a one
copy vector employing the native HSA signal
sequence:

Plasmid pHSA111 is identical to plasmid
pHSA211, except pHSA111 contains a signal sequence
encoding the native 24-amino acid HSA secretion signal
sequence (instead of the AMF secretion signal sequence).
The signal sequence was designed, using yeast-preferred
codons, to (1) place an EcoRI site immediately 5' of the
initiation codon, and (2) delete the codon for
methionine from the 5' end of the HSA gene in vector
pMET-HSA. To accomplish these changes, plasmid pHSA103
was subjected to a two-step mutagenesis as previously
described. The oligonucleotides employed for mutagenesis
were as follows:

mutagenizing oligo #1 (SEQ ID No. 10):
GCT TGC ATG CCT GCA GAA TTC ATG AAG TGG GTT ACT TTC ATT
TCT TTG TTG TTC GAC GCT CAC AAG TCT

screening oligo #1 (SEQ ID No. 11):
ACT TTC ATT TCT TTG TTG

mutagenizing oligo #2 (SEQ ID No. 12):
ATT TCT TTG TTG TTC TTG TTC TCT TCT GCT TAC TCT AGA GGT
GTT TTC AGA AGA GAC GCT CAC AAG TCT

screening oligo #2 (SEQ ID No. 13):
TCT GCT TAC TCT AGA GGT

The mutagenized clone, pHSA105, was sequenced
to verify the addition of the signal sequence and the
EcoRI site as well as the deletion of the codon for
methionine. The sequence of the final mutagenized gene
in pHSA105 comprises the HSA leader sequence set forth in
Sequence ID No. 3, fused to the HSA coding sequence set
forth in Sequence ID No. 1.
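
The two-step design can be followed in silico from the two
mutagenizing oligonucleotides. The sketch below is an illustration
under stated assumptions: it treats the first oligo as the product of
step one, models step two as insertion of the second oligo between
the 15-base flanks the two oligos share, and translates from the ATG
directly after the EcoRI site using the standard genetic code. The
function and variable names are hypothetical.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(seq):
    """Translate complete codons from the first base; trailing bases are ignored."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# Oligos copied from SEQ ID No. 10 and SEQ ID No. 12 as printed in the text.
OLIGO_1 = ("GCT TGC ATG CCT GCA GAA TTC ATG AAG TGG GTT ACT TTC ATT "
           "TCT TTG TTG TTC GAC GCT CAC AAG TCT").replace(" ", "")
OLIGO_2 = ("ATT TCT TTG TTG TTC TTG TTC TCT TCT GCT TAC TCT AGA GGT "
           "GTT TTC AGA AGA GAC GCT CAC AAG TCT").replace(" ", "")

# Flanks shared by both oligos; step two inserts the bases between them.
LEFT_FLANK, RIGHT_FLANK = OLIGO_2[:15], OLIGO_2[-15:]
assert OLIGO_1.count(LEFT_FLANK + RIGHT_FLANK) == 1

step_two_product = OLIGO_1.replace(LEFT_FLANK + RIGHT_FLANK, OLIGO_2)

# Open reading frame starts right after the EcoRI site (GAATTC).
orf_start = step_two_product.find("GAATTC") + 6
peptide = translate(step_two_product[orf_start:])
signal, mature_start = peptide[:24], peptide[24:]

if __name__ == "__main__":
    print("signal peptide:", signal, len(signal))
    print("start of mature HSA:", mature_start)
```

Under these assumptions the assembled region encodes a 24-residue
leader followed directly by asp-ala-his-lys-ser, consistent with the
24-amino acid signal and the removal of the methionine codon
described above.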

The HSA gene and signal sequence was isolated
on an approximately 1830-bp EcoRI fragment from pHSA105
and inserted into the unique EcoRI site of Pichia
pastoris vector pAO856 (Example 1j). The resulting
expression vector, pHSA111 (Figure 3), contains one copy
of an expression cassette comprising the HSA structural
gene and signal sequence under the control of the Pichia
pastoris AOX1 promoter and regulatory regions, as well as
the AOX1 transcription termination and polyadenylation
signals. In addition, the vector includes the Pichia
pastoris HIS4 gene used for selection in His- hosts and
additional 3' AOX1 sequences that can be used to direct
integration into the host genome.

The DNA for the entire HSA insert as well as
approximately 65 nucleotides each of the promoter and
termination regions of pHSA111 were sequenced to verify
that no changes had occurred during cloning.

c. Construction of plasmid pHSA212, a two copy
vector employing the AMF signal sequence:

The approximately 3300 bp HSA expression cassette
was isolated from pHSA211 on a ClaI-BamHI fragment and
inserted back into the ClaI-BglII sites of pHSA211. The
ligation was transformed into MC1061 cells and ampR
colonies were selected. Correct transformants
demonstrated ~6570 and ~6525 bp ClaI-BamHI fragments upon
digestion. The resulting vector, pHSA212 (see Figure 5),
contains two copies of the expression cassette linked as
tandem-repeat units, as verified by restriction enzyme
digests.
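
Inserting a BamHI end into a BglII-cut site works because the two
enzymes leave the same GATC overhang, and the joined sequence is then
recognized by neither enzyme. This is a general property of the two
enzymes rather than something stated in the text; the sketch below
only illustrates it, with hypothetical helper names.

```python
# Recognition sequence and top-strand cut offset (both enzymes cut after base 1).
SITES = {"BamHI": ("GGATCC", 1), "BglII": ("AGATCT", 1)}


def overhang(enzyme):
    """Return the 5' overhang left by a palindromic site cut symmetrically."""
    site, cut = SITES[enzyme]
    return site[cut:len(site) - cut]


def junction(left_enzyme, right_enzyme):
    """Sequence formed when an end cut by left_enzyme is ligated to one cut by right_enzyme."""
    left_site, left_cut = SITES[left_enzyme]
    right_site, right_cut = SITES[right_enzyme]
    return left_site[:left_cut] + right_site[right_cut:]


def recut_by(seq):
    """List the enzymes whose recognition sequence is present in seq."""
    return [name for name, (site, _) in SITES.items() if site in seq]


if __name__ == "__main__":
    print("overhangs:", overhang("BamHI"), overhang("BglII"))
    hybrid = junction("BamHI", "BglII")
    print("hybrid junction:", hybrid, "recut by:", recut_by(hybrid))
```

One consequence, offered here as a reading rather than a statement
from the source, is that the junction between cassette copies is
resistant to both enzymes, so the same ClaI-BamHI insertion step can
be repeated on the product.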

d. Construction of plasmid pHSA214, a four
copy vector employing the AMF signal sequence:

The approximately 6500 bp ClaI-BamHI fragment
from pHSA212, containing two copies of the expression
cassette, was isolated and inserted back into the ClaI-
BglII sites of pHSA212. The ligation was transformed
into MC1061 cells and ampR colonies were selected.
Correct transformants demonstrated ~13,000 and 6500 bp ClaI-
BamHI fragments upon digestion. The resulting vector,
pHSA214 (see Figure 6), contains four copies of the
expression cassette linked as tandem-repeat units, as
verified by restriction enzyme digests.

e. Construction of plasmid pHSA216, a six
copy vector employing the AMF signal sequence:

The approximately 6500 bp ClaI-BamHI fragment
from pHSA212, containing two copies of the expression
cassette, was inserted into the ClaI-BglII sites of
pHSA214. The ligation was transformed into MC1061 cells
and ampR colonies were selected. Correct transformants
demonstrated ~19,500 and ~6500 bp ClaI-BamHI fragments upon
digestion. The resulting vector, pHSA216 (see Figure 7),
contains six copies of the expression cassette linked as
tandem-repeat units, as verified by restriction enzyme
digests.
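
The diagnostic fragment sizes quoted for the two-, four- and six-copy
vectors can be compared with a simple multiple of the cassette size.
The sketch below uses only numbers quoted in sections c to e; the 5%
tolerance is an arbitrary assumption chosen for illustration, and the
names are hypothetical.

```python
CASSETTE_BP = 3300  # approximate cassette size quoted in section c

# Larger ClaI-BamHI fragment reported for each copy number (sections c, d, e).
REPORTED_BP = {2: 6570, 4: 13000, 6: 19500}


def expected_array_bp(copies, cassette_bp=CASSETTE_BP):
    """Expected size of a tandem array of identical cassettes."""
    return copies * cassette_bp


def relative_deviation(copies):
    """Fractional difference between expected and reported fragment size."""
    reported = REPORTED_BP[copies]
    return abs(expected_array_bp(copies) - reported) / reported


if __name__ == "__main__":
    for copies in sorted(REPORTED_BP):
        print(copies, expected_array_bp(copies), REPORTED_BP[copies],
              f"{relative_deviation(copies):.1%}")
```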

f. Construction of plasmid pAO203

The AOX1 transcription terminator was isolated
from 20 µg of pPG2.0 [pPG2.0 = BamHI-HindIII fragment of
pG4.0 (NRRL 15868) + pBR322] by StuI digestion followed
by the addition of 0.2 µg SalI linkers (GGTCGACC). The
plasmid was subsequently digested with HindIII and the
350 bp fragment isolated from a 10% acrylamide gel and
subcloned into pUC18 (Boehringer Mannheim) digested with
HindIII and SalI. The ligation mix was transformed into
JM103 cells (that are widely available) and ampR colonies
were selected. The correct construction was verified by
HindIII and SalI digestion, which yielded a 350 bp
fragment, and was called pAO201.

5 µg of pAO201 was digested with HindIII,
filled in using Klenow polymerase, and 0.1 µg of BglII
linkers (GAGATCTC) were added. After digestion of the
excess BglII linkers, the plasmid was reclosed and
transformed into MC1061 cells. AmpR cells were selected,
DNA was prepared, and the correct plasmid was verified by
BglII, SalI double digests, yielding a 350 bp fragment,
and by a HindIII digest to show loss of the HindIII site.
This plasmid was called pAO202.

The alpha factor-GRF fusion was isolated as a
360 bp BamHI-PstI partial digest from pYSV201. Plasmid
pYSV201 is the EcoRI-S^I fragment of GRF-E-3 inserted
into M13mp18 (New England Biolabs). Plasmid GRF-E-3 is
described in EP 206,783. 20 µg of pYSV201 plasmid was
digested with BamHI and partially digested with PstI. To
this partial digest was added the following
oligonucleotides:

5' AATTCGATGAGATTTCCTTCAATTTTTACTGCA 3' (SEQ ID No. 14)
3'     GCTACTCTAAAGGAAGTTAAAAATG     5' (SEQ ID No. 15).

Only the antisense strand of the oligonucleotide was
kinase labelled so that the oligonucleotides did not
polymerize at the 5' end. After acrylamide gel
electrophoresis (10%), the fragment of 385 bp was
isolated by electroelution. This EcoRI-BamHI fragment of
385 bp was cloned into pAO202 which had been cut with
EcoRI and BamHI. Routinely, 5 ng of vector cut with the
appropriate enzymes and treated with calf intestine
alkaline phosphatase was ligated with 50 ng of the
insert fragment to produce plasmid pAO203.

g. Construction of plasmid pAO804

Plasmids pAO804 and pAO807 (see Example 1i) are
used in the construction of plasmid pAO815.

Plasmid pAO804 has been described in PCT
Application No. WO 89/04320. Construction of this
plasmid involved the following steps:

Plasmid pBR322 was modified to eliminate the
EcoRI site and insert a BglII site into the PvuII site as
follows:

pBR322 was digested with EcoRI, the protruding
ends were filled in with Klenow Fragment of E. coli DNA
polymerase I, and the resulting DNA was recircularized
using T4 ligase. The recircularized DNA was used to
transform E. coli MC1061 to ampicillin-resistance and
transformants were screened for having a plasmid of about
4.37 kbp in size without an EcoRI site. One such
transformant was selected and cultured to yield a
plasmid, designated pBR322ΔRI, which is pBR322 with the
EcoRI site replaced with the sequence (SEQ ID No. 16):

5'-GAATTAATTC-3'

Plasmid pBR322ΔRI was digested with PvuII, and
the linker having the sequence:

5'-CAGATCTG-3'
3'-GTCTAGAC-5'

was ligated to the resulting blunt ends employing T4
ligase. The resulting DNAs were recircularized, also
with T4 ligase, and then digested with BglII and again
recircularized using T4 ligase to eliminate multiple
BglII sites due to ligation of more than one linker to
the PvuII-cleaved pBR322ΔRI. The DNAs, treated to
eliminate multiple BglII sites, were used to transform E.
coli MC1061 to ampicillin-resistance. Transformants were
screened for a plasmid of about 4.38 kbp with a BglII
site. One such transformant was selected and cultured to
yield a plasmid, designated pBR322ΔRIBGL, for further
work. Plasmid pBR322ΔRIBGL is the same as pBR322ΔRI
except that pBR322ΔRIBGL has the sequence (SEQ ID No. 17)

5'-CAGCAGATCTGCTG-3'

in place of the PvuII site in pBR322ΔRI.
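
Both replacement sequences follow mechanically from the operations
described: filling in the four-base EcoRI overhang before blunt
ligation, and inserting the eight-base linker at the blunt PvuII cut.
The sketch below reproduces them from the recognition sequences
(EcoRI G^AATTC, PvuII CAG^CTG); the function names are hypothetical
and only the top strand is modelled.

```python
def fill_in_and_religate(site, cut):
    """Top strand after cutting a palindromic 5'-overhang site, filling in both ends
    and blunt-ligating them back together."""
    left_end = site[:len(site) - cut]   # left arm plus filled-in overhang
    right_end = site[cut:]              # filled-in overhang plus right arm
    return left_end + right_end


def insert_linker(site, cut, linker):
    """Top strand after inserting a blunt linker at a blunt cut within the site."""
    return site[:cut] + linker + site[cut:]


ECORI, PVUII, BGLII = "GAATTC", "CAGCTG", "AGATCT"

filled_ecori = fill_in_and_religate(ECORI, cut=1)
pvuii_with_linker = insert_linker(PVUII, cut=3, linker="CAGATCTG")

if __name__ == "__main__":
    print(filled_ecori, "EcoRI site retained:", ECORI in filled_ecori)
    print(pvuii_with_linker, "BglII site present:", BGLII in pvuii_with_linker)
```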

Plasmid pBR322ΔRIBGL was digested with SalI
and BglII and the large fragment (approximately 2.97 kbp)
was isolated. Plasmid pBSAGI5I, which is described in
European Patent Application Publication No. 0 226 752,
was digested completely with BglII and XhoI and an
approximately 850 bp fragment from a region of the P.
pastoris AOX1 locus downstream from the AOX1 gene
transcription terminator (relative to the direction of
transcription from the AOX1 promoter) was isolated. The
BglII-XhoI fragment from pBSAGI5I and the approximately
2.97 kbp SalI-BglII fragment from pBR322ΔRIBGL were
combined and subjected to ligation with T4 ligase. The
ligation mixture was used to transform E. coli MC1061 to
ampicillin-resistance and transformants were screened for
a plasmid of the expected size (approximately 3.8 kbp)
with a BglII site. This plasmid was designated pAO801.
The overhanging end of the SalI site from the
pBR322ΔRIBGL fragment was ligated to the overhanging end
of the XhoI site on the 850 bp pBSAGI5I fragment and, in
the process, both the SalI site and the XhoI site in
pAO801 were eliminated.

Plasmid pBSAGI5I was then digested with ClaI
and the approximately 2.0 kbp fragment was isolated. The
2.0 kbp fragment has an approximately 1.0 kbp segment
which comprises the P. pastoris AOX1 promoter and
transcription initiation site, an approximately 700 bp
segment encoding the hepatitis B virus surface antigen
("HBsAg") and an approximately 300 bp segment which
comprises the P. pastoris AOX1 gene polyadenylation
signal and site-encoding segments and transcription
terminator. The HBsAg coding segment of the 2.0 kbp
fragment is terminated, at the end adjacent the 1.0 kbp
segment with the AOX1 promoter, with an EcoRI site and,
at the end adjacent the 300 bp segment with the AOX1
transcription terminator, with a StuI site, and has its
subsegment which codes for HBsAg oriented and positioned,
with respect to the 1.0 kbp promoter-containing and 300
bp transcription terminator-containing segments,
operatively for expression of the HBsAg upon
transcription from the AOX1 promoter. The EcoRI site
joining the promoter segment to the HBsAg coding segment
occurs just upstream (with respect to the direction of
transcription from the AOX1 promoter) from the
translation initiation signal-encoding triplet of the
AOX1 promoter.

For more details on the promoter and terminator
segments of the 2.0 kbp, ClaI-site-terminated fragment of
pBSAGI5I, see U.S. Pat. No. 4,895,800, European Patent
Application Publication No. 226,846 and Ellis et al.,
Mol. Cell. Biol. 5, 1111 (1985).

Plasmid pAO801 was cut with ClaI and combined
for ligation using T4 ligase with the approximately 2.0
kbp ClaI-site-terminated fragment from pBSAGI5I. The
ligation mixture was used to transform E. coli MC1061 to
ampicillin resistance, and transformants were screened
for a plasmid of the expected size (approximately 5.8
kbp) which, on digestion with ClaI and BglII, yielded
fragments of about 2.32 kbp (with the origin of
replication and ampicillin-resistance gene from pBR322)
and about 1.9 kbp, 1.48 kbp, and 100 bp. On digestion
with BglII and EcoRI, the plasmid yielded an
approximately 2.48 kbp fragment with the 300 bp
terminator segment from the AOX1 gene and the HBsAg
coding segment, a fragment of about 900 bp containing the
segment from upstream of the AOX1 protein encoding
segment of the AOX1 gene in the AOX1 locus, and a
fragment of about 2.42 kbp containing the origin of
replication and ampicillin resistance gene from pBR322
and an approximately 100 bp ClaI-BglII segment of the
AOX1 locus (further upstream from the AOX1-encoding
segment than the first mentioned 900 bp EcoRI-BglII
segment). Such a plasmid had the ClaI fragment from
pBSAGI5I in the desired orientation; in the opposite,
undesired orientation, there would be EcoRI-BglII
fragments of about 3.3 kbp, 2.38 kbp and 900 bp.

One of the transformants harboring the desired
plasmid, designated pAO802, was selected for further work
and was cultured to yield that plasmid. The desired
orientation of the ClaI fragment from pBSAGI5I in pAO802
had the AOX1 gene in the AOX1 locus oriented correctly to
lead to the correct integration into the P. pastoris
genome at the AOX1 locus of linearized plasmid made by
cutting at the BglII site at the terminus of the 800 bp
fragment from downstream of the AOX1 gene in the AOX1
locus.

pAO802 was then treated to remove the HBsAg
coding segment terminated with an EcoRI site and a StuI
site. The plasmid was digested with StuI and a linker of
sequence:

5'-GGAATTCC-3'
3'-CCTTAAGG-5'

was ligated to the blunt ends using T4 ligase. The
mixture was then treated with EcoRI and again subjected
to ligation using T4 ligase. The ligation mixture was
then used to transform E. coli MC1061 to ampicillin
resistance and transformants were screened for a plasmid
of the expected size (5.1 kbp) with EcoRI-BglII fragments
of about 1.78 kbp, 900 bp, and 2.42 kbp and BglII-ClaI
fragments of about 100 bp, 2.32 kbp, 1.48 kbp, and 1.2
kbp. This plasmid was designated pAO803. A transformant
with the desired plasmid was selected for further work
and was cultured to yield pAO803.

Plasmid pAO804 was then made from pAO803 by
inserting, into the BamHI site from pBR322 in pAO803, an
approximately 2.75 kbp BglII fragment from the P.
pastoris HIS4 gene. See, e.g., Cregg et al., Mol. Cell.
Biol. 5, 3376 (1985) and European Patent Application
Publication Nos. 180,899 and 188,677. pAO803 was digested
with BamHI and combined with the HIS4 gene-containing
BglII site-terminated fragment and the mixture subjected
to ligation using T4 ligase. The ligation mixture was
used to transform E. coli MC1061 to ampicillin-resistance
and transformants were screened for a plasmid of the
expected size (7.85 kbp), which is cut by SalI. One such
transformant was selected for further work, and the
plasmid it harbors was designated pAO804.
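
The screening criteria above rest on simple size bookkeeping: the
fragments of a complete digest of a circular plasmid should add up to
the plasmid size, and an insertion adds its own length. The sketch
below checks the figures quoted for pAO802, pAO803 and pAO804. The
0.05 kbp tolerance is an assumption for illustration, since the
quoted sizes are approximate, and the names are hypothetical.

```python
# Plasmid sizes (kbp) as quoted in the text.
PLASMID_KBP = {"pAO802": 5.8, "pAO803": 5.1, "pAO804": 7.85}

# Digest fragments (kbp) as quoted in the text.
DIGESTS_KBP = {
    ("pAO802", "ClaI + BglII"): [2.32, 1.9, 1.48, 0.1],
    ("pAO802", "BglII + EcoRI"): [2.48, 0.9, 2.42],
    ("pAO803", "EcoRI + BglII"): [1.78, 0.9, 2.42],
    ("pAO803", "BglII + ClaI"): [0.1, 2.32, 1.48, 1.2],
}

HIS4_FRAGMENT_KBP = 2.75  # BglII fragment inserted to make pAO804


def digest_is_consistent(plasmid, fragments, tolerance_kbp=0.05):
    """True if the fragment sizes sum to the plasmid size within the tolerance."""
    return abs(sum(fragments) - PLASMID_KBP[plasmid]) <= tolerance_kbp


if __name__ == "__main__":
    for (plasmid, enzymes), fragments in DIGESTS_KBP.items():
        print(plasmid, enzymes, round(sum(fragments), 2),
              digest_is_consistent(plasmid, fragments))
    print("pAO803 + HIS4 fragment:",
          round(PLASMID_KBP["pAO803"] + HIS4_FRAGMENT_KBP, 2))
```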

Plasmid pAO804 has one SalI-ClaI fragment of
about 1.5 kbp and another of about 5.0 kbp and a ClaI-ClaI
fragment of 1.3 kbp; this indicates that the direction of
transcription of the HIS4 gene in the plasmid is the same
as the direction of transcription of the ampicillin
resistance gene and opposite the direction of
transcription from the AOX1 promoter.

The orientation of the HIS4 gene in pAO804 is
not critical to the function of the plasmid or of its
derivatives with cDNA coding segments inserted at the
EcoRI site between the AOX1 promoter and terminator
segments. Thus, a plasmid with the HIS4 gene in the
orientation opposite that of the HIS4 gene in pAO804
would also be effective for use in accordance with the
present invention.

h. Construction of plasmid pAO815

Plasmid pAO815 was constructed by mutagenizing
plasmid pAO807 to change the ClaI site downstream of the
AOX1 transcription terminator in pAO807 to a BamHI site.
The oligonucleotide used for mutagenizing pAO807 had the
following sequence (SEQ ID No. 18):

5' GAC GTT CGT TTG TGC GGA TCC AAT GCG GTA GTT TAT 3'.

The mutagenized plasmid was called pAO807-Bam. Plasmid
pAO804 was digested with BglII and 25 ng of the 2400 bp
fragment were ligated to 250 ng of the 5400 bp BglII
fragment from BglII-digested pAO807-Bam. The ligation
mix was transformed into MC1061 cells and the correct
construct was verified by digestion with PstI/BamHI to
identify 6100 and 2100 bp sized bands. The correct
construct was called pAO815 (see Figure 1).
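
The masses quoted for this ligation can be converted to molar amounts
to see the ratio of the two fragments. The sketch below is an
illustration only. It assumes an average mass of about 650 g/mol per
base pair of double-stranded DNA, a common approximation that is not
given in the text; the ratio itself does not depend on that constant.
The function name is hypothetical.

```python
def pmol_dsdna(mass_ng, length_bp, grams_per_mol_per_bp=650.0):
    """Approximate picomoles of a double-stranded DNA fragment (assumed 650 g/mol per bp)."""
    return mass_ng * 1000.0 / (length_bp * grams_per_mol_per_bp)


# Masses and lengths as quoted in the text.
pmol_2400 = pmol_dsdna(25, 2400)    # 2400 bp BglII fragment from pAO804
pmol_5400 = pmol_dsdna(250, 5400)   # 5400 bp BglII fragment from pAO807-Bam

molar_ratio = pmol_5400 / pmol_2400

if __name__ == "__main__":
    print(f"2400 bp fragment: {pmol_2400:.4f} pmol")
    print(f"5400 bp fragment: {pmol_5400:.4f} pmol")
    print(f"molar ratio (5400 bp : 2400 bp) = {molar_ratio:.2f}")
```

On these figures the 5400 bp fragment is present in roughly a
4.4-fold molar excess over the 2400 bp fragment.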

i. Construction of plasmid pAO807

1. Preparation of f1-ori DNA

f1 bacteriophage DNA (50 µg) was digested with
50 units of RsaI and DraI (according to manufacturer's
directions) to release the ~458 bp DNA fragment
containing the f1 origin of replication (ori). The
digestion mixture was extracted with an equal volume of
phenol:chloroform (V/V) followed by extracting the
aqueous layer with an equal volume of chloroform and
finally the DNA in the aqueous phase was precipitated by
adjusting the NaCl concentration to 0.2 M and adding 2.5
volumes of absolute ethanol. The mixture was allowed to
stand on ice (4°C) for 10 minutes and the DNA precipitate
was collected by centrifugation for 30 minutes at 10,000
x g in a microfuge at 4°C.
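
The precipitation step can be expressed as a short dilution
calculation. The sketch below is an illustration under several
assumptions that are not in the text: a 5 M NaCl stock, negligible
NaCl in the sample before adjustment, a 100 µl sample, and the 2.5
volumes of ethanol being measured against the volume after salt
addition. The function names are hypothetical.

```python
def nacl_stock_volume_ul(sample_ul, final_molar=0.2, stock_molar=5.0):
    """Volume of NaCl stock that brings a salt-free sample to final_molar."""
    if stock_molar <= final_molar:
        raise ValueError("stock must be more concentrated than the target")
    return sample_ul * final_molar / (stock_molar - final_molar)


def ethanol_volume_ul(aqueous_ul, volumes=2.5):
    """Volume of absolute ethanol for the given number of volumes."""
    return aqueous_ul * volumes


sample_ul = 100.0  # hypothetical sample volume
salt_ul = nacl_stock_volume_ul(sample_ul)
ethanol_ul = ethanol_volume_ul(sample_ul + salt_ul)

if __name__ == "__main__":
    print(f"add {salt_ul:.2f} ul of 5 M NaCl, then {ethanol_ul:.1f} ul ethanol")
```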

The DNA pellet was washed 2 times with 70%
aqueous ethanol. The washed pellet was vacuum dried and
dissolved in 25 µl of TE buffer. This DNA was
electrophoresed on a 1.5% agarose gel and the gel portion
containing the ~458 bp f1-ori fragment was excised
and the DNA in the gel was electroeluted onto DE81
(Whatman) paper and eluted from the paper in 1 M NaCl. The
DNA solution was precipitated as detailed above and the
DNA precipitate was dissolved in 25 µl of TE buffer (f1-
ori fragment).

2. Cloning of f1-ori into DraI sites of
pBR322

pBR322 (2 µg) was partially digested with 2
units DraI (according to manufacturer's instructions).
The reaction was terminated by phenol:chloroform
extraction followed by precipitation of DNA as detailed
in step 1 above. The DNA pellet was dissolved in 20 µl
of TE buffer. About 100 ng of this DNA was ligated with
100 ng of f1-ori fragment (step 1) in 20 µl of ligation
buffer by incubating at 14°C overnight with 1 unit of
T4 DNA ligase. The ligation was terminated by heating to
70°C for 10 minutes and then used to transform E. coli
strain JM103. AmpR transformants were pooled and
superinfected with helper phage R408. Single-stranded
phage were isolated from the media and used to reinfect
JM103. AmpR transformants contained pBRf1-ori which
contains f1-ori cloned into the DraI sites (nucleotide
positions 3232 and 3251) of pBR322.

3. Construction of plasmid pAO807

pBRf1-ori (10 µg) was digested for 4 hours at
37°C with 10 units each of PstI and NdeI. The digested
DNA was phenol:chloroform extracted, precipitated and
dissolved in 25 µl of TE buffer as detailed in step 1
above. This material was electrophoresed on a 1.2%
agarose gel and the NdeI-PstI fragment (approximately
0.8 kb) containing the f1-ori was isolated and dissolved
in 20 µl of TE buffer as detailed in step 1 above. About
100 ng of this DNA was mixed with 100 ng of pAO804 that
had been digested with PstI and NdeI and phosphatase-
treated. This mixture was ligated in 20 µl of ligation
buffer by incubating overnight at 14°C with 1 unit of
T4 DNA ligase. The ligation reaction was terminated by
heating at 70°C for 10 minutes. This DNA was used to
transform E. coli strain JM103 to obtain pAO807.

Expression vector pAO856 is the same as vector
pAO815 except for the following differences:

1. The BamH I site at the 3' end of the
transcription termination region in pAO815 is changed to
a Hind III-BamH I double site in pAO856. This modification
allows each gene in a multi-copy strain, developed using
the pAO856 parent vector, to be isolated on a Hind III
fragment for independent sequencing during verification.

2. The EcoR V site in the pBR322 region between the
transcription terminator and the HIS4 gene has been
deleted in pAO856; thus, pAO856 contains a single EcoR V
site in the AOX1 3' region. Thus, when a pAO856-based
expression vector is integrated into the AOX1 3' region
(see #3, below), the entire vector can be recovered on an
EcoR V fragment. This will be useful for Southern
analyses and strain verification.

3. A Not I site has been added to the 3' AOX1
region in pAO856. A pAO856-based expression vector
linearized with Not I can be used for site-directed
integration into the AOX1 3' region. In contrast, multi-
copy vectors based on pAO815 can be directed into the
HIS4 locus only.

4. The BglII site at the 3' end of the AOX1 3'
region has been deleted in pAO856. Multi-copy vectors
constructed with pAO815 are generated by isolating the
expression cassette on a BglII-BamHI fragment and
inserting this fragment back into the BamHI site as a
tandem-repeat unit. There are two problems associated
with this approach. First, the BglII-BamHI fragment can
insert in either orientation; thus, approximately half of
the insertions are incorrect and the number of
transformants which have to be screened by minipreps and
enzyme digestion is increased. Second, inverted-repeat
units usually lead to recombination events and, as a
result, transformants are more difficult to screen
because there is a wide variety of digestion patterns.
In pAO856-based vectors, the expression cassette can be
isolated on a ClaI-BamHI fragment and inserted into the
ClaI-BglII sites. The fragment can insert in only one
orientation, thus reducing the number of transformants
which need to be screened and increasing the ease of
screening.

Additionally, in pAO815-based vectors, the BglII-BamHI
digest also liberates an approximately 2400-bp
BglII fragment which contains the E. coli ori and amp
resistance gene. Any agarose gel-isolated expression
cassette of approximately 2400-bp or longer inevitably
also contains the 2400-bp BglII fragment (due to the
limitations of gel separations). The BglII fragment can
self-ligate and transform the host. In fact, because the
BglII plasmid is so small, it is a "preferred"
transforming fragment, and even a seemingly minuscule
amount in a fragment prep leads to a large number of
incorrect transformants. In pAO856-based vectors, the
ori and amp gene are contained on a BamHI-ClaI fragment
which does not easily self-ligate.

Vector Construction of pAO856

1. Removal of EcoRV site

The approximately 2000-bp EcoRI-SalI
fragment from pAO804 (described in Example If) containing
the transcription termination region, 350 bp of
pBR322, and the 3' half of the HIS4 gene was inserted
into M13mp19. Transformants were screened with HindIII
digests. Positive transformants exhibited bands of about
1900 and 7500 bp and were called pHIS102.

Three bases in the EcoRV site located in
the pBR322 segment immediately following the AOX1
transcription termination region were deleted in pHIS102
by site-directed mutagenesis. The deletion effected
removal of the recognition site. The oligonucleotides
were as follows:

Mutagenesis oligo (SEQ ID No. 19):
GGCCTCTTGCGGGATGTCCATTCCGACAGC
Screening oligo (SEQ ID No. 20):
TTGCGGGATGTCCATTCC

The mutagenized plasmid, pHIS103, was sequenced to verify
the deletion.
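
As a simple consistency check on the oligonucleotides of step 1, the Python sketch below confirms that the mutagenesis oligo no longer carries an EcoRV recognition sequence and that the screening oligo lies within it. The recognition sequence GATATC is general knowledge rather than a statement of this specification, and the helper names are hypothetical.

```python
ECORV_SITE = "GATATC"  # EcoRV recognition sequence (not given in the text)
MUTAGENESIS_OLIGO = "GGCCTCTTGCGGGATGTCCATTCCGACAGC"  # SEQ ID No. 19
SCREENING_OLIGO = "TTGCGGGATGTCCATTCC"                # SEQ ID No. 20

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def contains_site(seq, site):
    """True if the site occurs on either strand of seq."""
    seq = seq.upper()
    return site in seq or site in reverse_complement(seq)


if __name__ == "__main__":
    print("EcoRV site in mutagenesis oligo:",
          contains_site(MUTAGENESIS_OLIGO, ECORV_SITE))
    print("screening oligo within mutagenesis oligo:",
          SCREENING_OLIGO in MUTAGENESIS_OLIGO)
```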

2. Alter HindIII-ClaI-HindIII site to a
HindIII-BamHI site

Plasmid pHIS103 was used to change the
HindIII-ClaI-HindIII cluster of restriction sites at the
3' end of the AOX1 transcription termination region to
HindIII-BamHI sites by site-directed mutagenesis.

Mutagenizing oligo (SEQ ID No. 21):
TTTGTGCAAGCTTATGGATCCGCTTTAATGCGGTAGT
Screening oligo (SEQ ID No. 22):
TTATGGATCCGCTTT

The mutagenized plasmid, pHIS103M, was sequenced to
verify the changes. The EcoRI-SalI fragment from
pHIS103M was used in the construction of pAO856, which
follows below (see part 4).


3. Insert BamHI site in HIS4 gene and at 3'
end of 3' AOX1 sequences; Insert NotI site
in 3' AOX1 sequence

Vector pAO804B is similar to pAO804 except
that the HIS4 gene in pAO804B contains a BamHI site which
was deleted from the HIS4 gene in pAO804. In order to
construct pAO804B, the approximately 2700-bp BglII
fragment, which is the HIS4 gene, from pYJ8 (NRRL B-15889)
was inserted into the BamHI site of pAO803 (see
Example Ig). Correct transformants displayed
approximately 3000- and 5000-bp fragments upon digestion
with Pscaii- 

The approximately 1400-bp BamHI-BglII
fragment from pAO804B, comprising the AOX1 3' targeting
sequences and the first half of the HIS4 gene, was
isolated and inserted into the BamHI site of M13mp19.
Transformants with the desired orientation displayed
fragments of approximately 1300- and 7400-bp upon double-
digestion with EcoR V and SalI.

Single-stranded plasmid pHIS804 was used
as template for site-directed mutagenesis to insert a
BamHI site at the 3' end of the AOX1 3' region.

Mutagenizing oligo (SEQ ID No. 23):
TTCGAGCTCGGTACCTAAGGATCCTGAGATAAATTTCA
Screening oligo (SEQ ID No. 24):
TACCTAAGGATCCTGAG

The mutagenized plasmid, pHIS804M1, was sequenced to
verify the change.

Plasmid pHIS804M1 was used as a template
for site-directed mutagenesis to insert the sequence for
a Not I site into the 3' AOX1 region.

Mutagenizing oligo (SEQ ID No. 25):
ACGTTGTCACTGAAGGCGGCCGCAGTATCTACAAACC
Screening oligo (SEQ ID No. 26):
TGAAGGCGGCCGCAGT

The mutagenized plasmid, pHIS804M2, was sequenced to
verify the change.
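
The site insertions of step 3 can be checked in the same spirit. The Python sketch below, offered only as an illustration, tests that each mutagenizing oligo carries the intended recognition sequence and that the matching screening oligo spans it. The recognition sequences (GGATCC for BamHI, GCGGCCGC for NotI) and all function names are assumptions not stated in this specification.

```python
SITES = {"BamHI": "GGATCC", "NotI": "GCGGCCGC"}  # recognition sequences (assumed)

# (mutagenizing oligo, screening oligo, site intended to be introduced)
OLIGO_PAIRS = {
    "SEQ ID Nos. 23/24": ("TTCGAGCTCGGTACCTAAGGATCCTGAGATAAATTTCA",
                          "TACCTAAGGATCCTGAG", "BamHI"),
    "SEQ ID Nos. 25/26": ("ACGTTGTCACTGAAGGCGGCCGCAGTATCTACAAACC",
                          "TGAAGGCGGCCGCAGT", "NotI"),
}


def check_pair(mutagenic, screening, site_name):
    """Report whether the new site is present and covered by the screening oligo."""
    site = SITES[site_name]
    return {
        "site_in_mutagenic": site in mutagenic,
        "site_in_screening": site in screening,
        "screening_within_mutagenic": screening in mutagenic,
    }


if __name__ == "__main__":
    for label, (mutagenic, screening, site_name) in OLIGO_PAIRS.items():
        print(label, site_name, check_pair(mutagenic, screening, site_name))
```

A screening oligo that spans the new site in this way would be expected to hybridize preferentially to mutagenized clones, which is the usual rationale for such a design.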

4. Assembly of pAO856

The approximately 1450-bp BamHI fragment
from pHIS804M2, comprising the first half of the HIS4
gene and the 3' AOX1 sequence, was isolated and inserted
into the BamHI and BglII sites of pBR322ARIBGL.
Transformants with the correct orientation displayed
fragments of approximately 2500- and 1600-bp upon
digestion with EcoRV, and were called pAO851.

The approximately 3000-bp £SQlII fragment 
from pAO804 [Example Ig], encoding the second half of the
AOX1 terminator and the 3' HIS4 gene, was isolated and
inserted into the PvuII site of pAO851. Transformants
with the correct orientation displayed fragments of
approximately 650- and 6500-bp upon digestion with
HindIII and approximately 2000- and 5500-bp upon double-
digestion with BamHI and SalI, and were called pAO852.

The approximately 3000-bp ClaI-SalI
fragment from pAO815 (Example Ih), encoding the 5' AOX1
sequence, the AOX1 terminator, and the 3' end of the HIS4
gene, was isolated and inserted into the ClaI and SalI
sites of pAO852. Transformants with the correct
orientation were linearized with BglII and displayed
fragments of approximately 3000- and 4700-bp upon double-
digestion with ClaI and SalI. Positive transformants
were called pAO855.

The approximately 2000-bp EcoRI-SalI
fragment from pHIS103M, encoding the AOX1 terminator and
the 3' end of HIS4, was isolated and inserted into the
EcoRI and SalI sites of pAO855. The resulting vector
pAO856 (see Figure 2) was linearized with EcoRV and
displayed fragments of approximately 430- and 7300-bp
upon digestion with HindIII.

Example 2

Development of HSA-secreting strains

Expression vectors pHSA111, pHSA211, pHSA212,
pHSA214, and pHSA216 were used to develop Mut+ strains of
Pichia pastoris for the expression of HSA. The phenotype
Mut+ refers to methanol utilization normal.
Transformation of host strain GS115, a his4- mutant of
Pichia pastoris deposited at ATCC under #20864, was
accomplished by the spheroplast method [Cregg, J.M.,
Barringer, K.J., Hessler, A.Y., and Madden, K.R.,
Mol. Cell. Biol. 5:3376 (1985); see also US 4,879,231].

The Mut+ strains were generated by integration
of the entire expression vector into the HIS4 locus by
additive homologous recombination. For site-directed
addition into the HIS4 locus, the expression vector was
linearized by digestion with StuI, which cleaves the
plasmid within the HIS4 region. In this additive
integration, the AOX1 locus is undisturbed, and the
transformants retain the Mut+ phenotype of wild-type
growth on methanol.

Characterization of the Strains by Southern
Blot Analysis

Mut+ transformants resulting from
integration by addition of expression vectors pHSA111,
pHSA211, pHSA212, pHSA214, and pHSA216 to the HIS4 locus
of GS115 were initially screened for histidine
prototrophy. DNA from His+ transformants was analyzed by
Southern blot hybridization to verify the site of
integration of the plasmid and the number of copies that
had integrated. Thus, chromosomal DNA was digested
separately with BglII and StuI. Two sets of BglII
digests were probed with pBR322-based plasmids containing
either the 5' and 3' regions of the Pichia AOX1 gene, or
the Pichia pastoris HIS4 gene. The StuI digests were
probed with plasmid pMET-HSA (see Example I). In
addition, chromosomal DNA from transformants resulting
from integration of plasmids pHSA214 and pHSA216 was
double-digested with ClaI and BamHI and probed with the
pBR322-based plasmid containing the AOX1 5' and 3'
regions.

Based on the results of Southern blot
analysis of DNA from strains generated by transformation
of GS115 with the five types of expression vectors, the
following strains were chosen for further
characterization:



                               Site of
Strain Name     Vector         Integration     Vector     Cassette

G+HSA111S4      pHSA111        HIS4            1          1
G+HSA111S6      pHSA111        HIS4            1          1
G+HSA211S4      pHSA211        HIS4            1          1
G+HSA211S6      pHSA211        HIS4            1          1
G+HSA212S31     pHSA212        HIS4            1          2
G+HSA212S32     pHSA212        HIS4            1          2
G+HSA214S34     pHSA214        HIS4            1          4
G+HSA214S40     pHSA214                        1          4
G+HSA214S51     pHSA214        HIS4            1          4
G+HSA216S47     pHSA216        HIS4            1          6
G+HSA216S56     pHSA216        HIS4            1          6
G+HSA216S44     pHSA216        HIS4            1          6



Example 3

Fermentation of HSA strains

HSA-expressing strains of Pichia pastoris were
grown in one-liter fermentations to evaluate and compare the
long-term growth and HSA production characteristics of the
strains. Because Pichia pastoris achieves such high cell
densities during extended growth periods, the cell mass
accounts for as much as 50% of the total volume of the
fermentor at the conclusion of the fermentation. Therefore,
one-liter fermentations of Pichia pastoris are conducted in
two-liter fermentors and typically yield 700-1000 ml of
cell-free broth.

The protocols for fermentations of HSA-secreting
strains of Pichia pastoris consist of three separate phases:

1) growth on excess glycerol,
2) growth on limited glycerol, and
3) growth on limited methanol.

Cells are initially grown on glycerol in a batch
mode. Because glycerol strongly represses the AOX1
promoter, the HSA gene, which is regulated by this promoter,
is not expressed during this phase. Following exhaustion of
the glycerol, a limited glycerol feed is initiated.
Glycerol does not accumulate during this phase, but cell
mass increases, and the AOX1 promoter is derepressed.
Finally, in the third phase, a methanol feed is initiated
which fully induces the AOX1 promoter for the production of
HSA. Three variations of this basic protocol were used to
evaluate HSA-expressing strains. The variable that was
altered in the three protocols was the amount of glycerol in
the batch and fed-batch phases.

1. Low cell density fermentation protocol

Runs 861 & 869: G+HSA111S4 (1 HSA-HSA expression
cassette)
Run 868: G+HSA111S6 (1 HSA-HSA expression
cassette)
Run 865: G+HSA212S31 (2 AMF-HSA expression
cassettes)
Run 866: G+HSA211S10 (1 AMF-HSA expression
cassette)

The following describes a three-stage fermentation
carried out with a 2% glycerol batch/40 ml 50% glycerol
fed batch, 6 ml/hr MeOH feed rate.

The fermentor was autoclaved with 500 ml 10X Basal
Salts (52 ml/l 85% phosphoric acid, 1.8 g/l Calcium
Sulfate·2H2O, 28.6 g/l Potassium Sulfate, 23.4 g/l
Magnesium Sulfate·7H2O, 6.5 g/l Potassium Hydroxide), 20 g
glycerol and water added to a one-liter volume. After
sterilization and cooling, 4 ml of PTM1 trace salts (5.0
ml/l Sulfuric Acid, 65.0 g/l Ferrous Sulfate·7H2O, 6.0 g/l
Copper Sulfate·5H2O, 20.0 g/l Zinc Sulfate·7H2O, 3.0 g/l
Manganese Sulfate·H2O, 0.1 g/l Biotin) were added to the
fermentation and the pH was adjusted to 5.0 with 50%
ammonium hydroxide containing 0.2% Struktol J673 antifoam.
The pH of the medium was maintained at 5 by addition of the
same solution. Excessive foaming was controlled by the
addition of a 5% solution of Struktol J673.
Temperature was maintained at 30°C and dissolved oxygen was
maintained above 20% saturation by increasing agitation,
concentration, aeration and supplementing with oxygen when
needed. Inocula were prepared from cells grown overnight in
buffered YNB (11.5 g/L KH2PO4, 2.66 g/L K2HPO4, 6.7 g/L yeast
nitrogen base, pH 6) containing 2% glycerol. The fermentor
was inoculated with 40-100 ml of the cultured cells which
had grown to an OD600 of 1-8, and the batch growth regimen
was continued for 18 to 24 hours. At the point of glycerol
exhaustion, indicated by an increase in dissolved oxygen
concentration, a glycerol feed (50% glycerol plus 12 ml/L
PTM1) was initiated at 10 ml/h. After four hours of
glycerol feeding, the feed was terminated and a 100%
methanol feed containing 12 ml/L PTM1 was initiated at a
feed rate of 2 ml/h. After three hours, the methanol feed
was increased to 6 ml/h and maintained for more than 72
hours on methanol.

2. Moderate cell density fermentation protocol

Runs 876 and 877: G+HSA212S31 (2 AMF-HSA expression
cassettes)

The following protocol describes a three-stage
fermentation carried out with a 5% glycerol batch/100 ml 50%
glycerol fed batch, 6 ml/h MeOH feed rate.

This protocol was identical to the low cell
density protocol except that the amount of glycerol in the
batch phase was increased from 20 to 50 grams and the
glycerol feed rate was increased to 20 ml/h and run for 5
hours. Therefore, the total volume of glycerol feed added
to the fermentation was increased from 40 to 100 ml. The
remainder of the fermentation protocol was followed as
described above, except that the pH of Run 877 was
maintained at 6.5 instead of 5.0.

3. High cell density fermentation protocol

Runs 890 & 891: G+HSA111S6 (1 HSA-HSA expression
cassette)
Runs 878 & 879: G+HSA212S31 (2 AMF-HSA expression
cassettes)

The following protocol describes a three-stage
fermentation carried out with a 5% glycerol batch/200 ml 50%
glycerol fed batch, 6 ml/h MeOH feed rate.

This protocol differed from the previous protocols
in that the initial batch phase contained 50 g glycerol, 500
ml 10X Basal Salts, and water added to a volume of 0.9
liters, reduced from a volume of one liter. The volume of
glycerol feed added during the glycerol fed-batch phase was
increased to 200 ml by starting with a feed rate of 20 ml/h
and increasing to 30 ml/h after 2 hours, and then 35 ml/h
after 2 more hours. After a total of seven hours of glycerol
feeding (and 200 ml volume added), the glycerol feed was
terminated and the methanol feed initiated at 2 ml/h and the
protocol was followed as described above. The methanol feed
rate in Run 879 was 9 ml/hr instead of 6 ml/hr.
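
The glycerol feed volumes quoted for the three protocols follow from the stated rates and durations, as the Python sketch below illustrates. The data layout and names are hypothetical, and the three-hour length of the final 35 ml/h segment in the high cell density protocol is inferred from the stated seven-hour total rather than given directly.

```python
# Glycerol fed-batch schedules as (rate in ml/h, duration in h) segments.
GLYCEROL_FEED = {
    "low": [(10, 4)],
    "moderate": [(20, 5)],
    # Last segment length (3 h) is inferred: 7 h total minus 2 h minus 2 h.
    "high": [(20, 2), (30, 2), (35, 3)],
}


def total_feed_ml(segments):
    """Total volume delivered over all feed segments."""
    return sum(rate * hours for rate, hours in segments)


def total_hours(segments):
    """Total duration of the feed schedule."""
    return sum(hours for _, hours in segments)


if __name__ == "__main__":
    for name, segments in GLYCEROL_FEED.items():
        print(f"{name}: {total_feed_ml(segments)} ml over "
              f"{total_hours(segments)} h")
```

On these figures the high cell density schedule delivers 205 ml, which agrees with the approximately 200 ml stated in the protocol.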

Methods of Monitoring the Fermentations

The levels of the NH4OH, antifoam, glycerol and
methanol reservoirs were recorded at time points. The wet
weight of the culture was also determined as an indicator of
cell growth in the fermentor. For this purpose, several
one-ml aliquots of the fermentor culture were centrifuged for
four minutes in a microfuge, the supernatant was decanted,
and the wet pellet was weighed. Methanol and ethanol
concentrations in the supernatant were determined by gas
chromatography using a PorapakQ column. In addition, four
cell pellets of 175 mg each and five supernatants of 1 ml
each were prepared from each 15-ml culture aliquot and
frozen for future analysis (i.e., immunoassay,
electrophoretic analysis, and immunoblot analysis).

Fermentation Results

The data for the above-described fermentations are
summarized below:

                       Cassette       Hours      Cell     HSA
Run    Strain          Copy Number    on MeOH    Yield    Secreted

861    G+HSA111S4      1              77         268      164
869    G+HSA111S4      1              76         305      149
868    G+HSA111S6      1              76         278      247
890    G+HSA111S6      1              89         306      335
891    G+HSA111S6      1              89         324      356
865    G+HSA212S31     2              76         271      764
866    G+HSA211S10     1              76         267      421
876    G+HSA212S31     2              89         329      1175
877    G+HSA212S31     2              89         301      E-aa
878    G+HSA212S31     2              87         327      1418
879    G+HSA212S31     2              65         388      1401



EXAMPLE 4. CHARACTERIZATION OF FERMENTATION PRODUCTS

A. HSA ELISA Protocol
1. ELISA

a. Reagents

Authentic HSA standard and goat anti-HSA
antibody conjugated to horseradish peroxidase were purchased
from Organon Teknika Corporation (Durham, NC). Authentic
HSA was purchased in lyophilized form from Organon Teknika
Corporation (Durham, NC) and reconstituted with distilled
water. The extinction coefficient at 280 nm of a 1%
solution is 5.3 and was used to determine the concentration
of the standard.
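
The conversion from absorbance to concentration implied by this extinction coefficient can be written out as follows. This Python sketch is an illustration only; the 1 cm path length, the optional dilution factor, and the function name are assumptions not stated in this specification.

```python
E1PCT_280 = 5.3  # A280 of a 1% (10 mg/ml) HSA solution, as stated above


def hsa_mg_per_ml(a280, dilution_factor=1.0, path_cm=1.0):
    """Estimate HSA concentration (mg/ml) of the undiluted standard.

    Assumes a 1 cm path length unless path_cm is given, and that
    a 1% solution corresponds to 10 mg/ml.
    """
    percent = a280 / (E1PCT_280 * path_cm)
    return percent * 10.0 * dilution_factor


if __name__ == "__main__":
    # A reading of 0.53 on an undiluted sample corresponds to about 1 mg/ml.
    print(f"{hsa_mg_per_ml(0.53):.2f} mg/ml")
```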

Goat anti-HSA antibody was obtained from
Atlantic Antibodies (Scarborough, ME). The horseradish
peroxidase substrate, o-phenylenediamine (OPD), was
purchased from Sigma Chemicals (St. Louis, MO). All other
chemicals were reagent grade and obtained from general
suppliers.

b. Method

The ELISA used for quantitation of HSA in
cell-free fermentor broth is a double-antibody assay in
which the HSA molecule is sandwiched between an anti-HSA
antibody coated on a 96-well microtiter plate and an
anti-HSA antibody conjugated to horseradish peroxidase.

In a typical assay, 200 µl of goat anti-human
HSA antibody diluted 1:500 with a carbonate coating buffer
(15 mM Na2CO3; 35 mM NaHCO3, pH 9.5) is added to each well of
a 96-well microtiter plate and incubated 60 minutes at 37°C.
After three washes with Tris-buffered saline (TBST: 10 mM
Tris-HCl, pH 7.5; 150 mM NaCl; 0.05% Tween-20), the plate is
incubated with 200 µl of "Blotto" buffer (0.003% antifoam-A;
0.1% thimerosal; 5.0% nonfat dry milk; 1X PBS, pH 7.5; 0.05%
Tween-20) overnight at 37°C to prevent nonspecific binding
of subsequent reagents. The plate is washed as before, and
aliquots of standard HSA or diluted unknown sample are added
to the wells and incubated for 2 hours at 37°C. After the
plate is washed as before, 200 µl of goat anti-human HSA
antibody conjugated to horseradish peroxidase is added, and
the plate is incubated for 2 hours at room temperature. The
plate is then washed as before, and 200 µl of the substrate
solution (10 mg of OPD in 0.0125% H2O2) is added. After a
15 minute incubation at room temperature in the dark, the
reaction is terminated by the addition of 50 µl of 4.5 N H2SO4
and the absorbance at 492 nm is measured.

The assay has a sensitivity of 0.1 ng and is
linear to 1.0 ng.
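
One way the stated linear range might be used for quantitation is sketched below in Python: a least-squares line is fitted to standards and unknowns are read back only when they fall between 0.1 and 1.0 ng. The standard values and absorbances shown are purely hypothetical placeholders, not data from this specification, and the function names are illustrative.

```python
LINEAR_RANGE_NG = (0.1, 1.0)  # sensitivity and upper limit stated above

# Hypothetical standard curve, for illustration only (not measured data).
HYPOTHETICAL_STANDARDS_NG = [0.1, 0.25, 0.5, 0.75, 1.0]
HYPOTHETICAL_A492 = [0.17, 0.35, 0.65, 0.95, 1.25]


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def hsa_ng(absorbance, slope, intercept):
    """Read HSA mass (ng) off the fitted line, rejecting out-of-range wells."""
    ng = (absorbance - intercept) / slope
    low, high = LINEAR_RANGE_NG
    if not low <= ng <= high:
        raise ValueError("reading outside the linear range; re-dilute the sample")
    return ng


if __name__ == "__main__":
    slope, intercept = fit_line(HYPOTHETICAL_STANDARDS_NG, HYPOTHETICAL_A492)
    print(f"well at A492 = 0.65 holds about {hsa_ng(0.65, slope, intercept):.2f} ng")
```

Samples whose readings fall outside the range would be re-assayed at a different dilution rather than extrapolated.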

2. SDS-PAGE

a. Sample preparation

Cell-free broth was examined by SDS-PAGE
and immunoblotting. Cell-free broth samples were prepared
by centrifuging fermentor cultures at 6500 x g for 10
minutes and decanting the broth. The broth was then diluted
two-fold with 2X Laemmli sample buffer (Nature 227:680,
1970) (0.125 M Tris-HCl, pH 6.8; 4% SDS; 20% glycerol;
0.005% bromophenol blue; 200 mM DTT) and boiled for 5
minutes. The sample buffer used for the reduced stained
SDS-PAGE and immunoblot contained 20 mM and 200 mM
dithiothreitol (DTT), respectively.

b. SDS-PAGE

SDS-PAGE analyses were performed essentially
as described by Laemmli using a 10% polyacrylamide gel with
a 4% polyacrylamide stacking gel. The electrophoreses were
carried out on a Mini-Protean gel apparatus (BioRad).

c. Coomassie Staining

Protein visualization by Coomassie staining
was conducted by staining the SDS-PAGE gels for 30 minutes
in 50% ethanol, 10% acetic acid, 5% TCA, and 200 mg/L
Coomassie Brilliant Blue. The gels were rehydrated by
incubation for 30 minutes in 10% ethanol, 10% acetic acid,
1% TCA, 50 mg/L Coomassie Brilliant Blue, and then destained
in 10% ethanol and 10% acetic acid.

3. Characterization of HSA

Protein staining and immunoblot analysis of SDS-PAGE
gels of reduced and non-reduced cell-free broth from
fermentations revealed the presence of a predominant protein
that migrated to the same position as authentic HSA (i.e.,
65 kD non-reduced, 70 kD reduced). A smaller species (~30
kD non-reduced, ~32 kD reduced) also was seen in broth from
some of the strains. A ~50 kD band was also detected in
the reduced broth samples for all fermentations. A similar,
if not identical, approximately 45 kD species has been
detected in the broth of recombinant HSA produced in S.
cerevisiae (Sleep, et al., Bio/Technology 8:42, 1990). A
small band (~32 kD non-reduced, ~30 kD reduced) was detected
in the broth from some of the strains.

A higher molecular weight species of >106 kD, which
is also methanol induction time dependent, was detected only
in the non-reduced immunoblot of the strains, suggesting the
formation of HSA multimers through disulfides.

The invention has been described in detail with
reference to particular embodiments thereof. It will be
understood, however, that variations and modifications can
be effected within the spirit and scope of the invention.

GAC GCT CAC AAG TCT GAA GTC GCT CAC AGA TTC AAG GAT CTA GGT GAA 48
Asp Ala His Lys Ser Glu Val Ala His Arg Phe Lys Asp Leu Gly Glu
1 5 10 15

GAA AAC TTC AAC GCT TTG GTT TTG ATT GCT TTC GCT CAA TAC TTG CAA 96
Glu Asn Phe Lys Ala Leu Val Leu Ile Ala Phe Ala Gln Tyr Leu Gln

20 25 30 

CAA TGT CCA TTC GAA GAC CAC GTC AAG TTG GTC AAC GAA GTT ACT GAA 144
Gln Cys Pro Phe Glu Asp His Val Lys Leu Val Asn Glu Val Thr Glu
35 40 45 

in GCT AAG ACC TGT GTT GCT CAC GAA TCT GCT GAA AAC TGT GAC AAG 192 
Phe Ala Lys Thr Cys Val Ala Asp Glu Ser Ala Glu Asn Cys Asp Lys 
50 55 60 

TCC TTG CAC ACT TTG TTC GGT GAC AAG TTG TGT ACT GTT GCT ACT TTG 240 
Ser Leu His Thr Leu Phe Gly Asp Lys Leu Cys Thr Val Ala Thr Leu 
65 70 75 80 

AGA GAA ACT TAC GGT GAA ATG GCT GAC TGT TGT GCT AAA CAG GAA CCA 288 
Arg Glu Thr Tyr Gly Glu Met Ala Asp Cys Cys Ala Lys Gln Glu Pro
85 90 95 

GAA AGA AAC GAA TGT TTC TTA CAA CAC AAG GAC GAC AAC CCA AAC TTG 336 

Glu Arg Asn Glu Cys Phe Leu Gln His Lys Asp Asp Asn Pro Asn Leu
100 105 110 

CCA AGA TTG GTT AGA CCA GAA CTC GAC CTT ATG TGT ACT GCT TTC CAC 384 
Pro Arg Leu Val Arg Pro Glu Val Asp Val Met Cys Thr Ala Phe His
115 120 125 

GAC AAC GAA GAG ACT TTC TTG AAG AAG TAC TTG TAC GAA ATC GCC AGA 432
Asp Asn Glu Glu Thr Phe Leu Lys Lys Tyr Leu Tyr Glu Ile Ala Arg
130 135 140 

AGA CAC CCA TAC TTC TAC GCT CCA GAA TTG TTG TTC TTC GCT AAG AGA 480
Arg His Pro Tyr Phe Tyr Ala Pro Glu Leu Leu Phe Phe Ala Lys Arg 
145 150 155 160 

TAC AAG GCT GCT TTC ACT GAA TGT TGT CAA GCT GCC GAC AAG GCT GCT 528 
Tyr Lys Ala Ala Phe Thr Glu Cys Cys Gln Ala Ala Asp Lys Ala Ala
165 170 175

TGT TTG TTG CCA AAG TTG GAC GAA TTG AGA GAC GAA GGT AAG GCT TCT 576 
Cys Leu Leu Pro Lys Leu Asp Glu Leu Arg Asp Glu Gly Lys Ala Ser
180 185 190 

TCC GCT AAG CAA AGA TTG AAG TGT GCT TCC TTG CAA AAG TTC GGT GAA 624 
Ser Ala Lys Gln Arg Leu Lys Cys Ala Ser Leu Gln Lys Phe Gly Glu
195 200 205 

AGA GCC TTC AAG GCC TGG GCT GTT GCT AGA TTG TCT CAA AGA TTC CCA 672
Arg Ala Phe Lys Ala Trp Ala Val Ala Arg Leu Ser Gln Arg Phe Pro

210 215 220 

AAG GCT GAA TTT GCT GAA GTT TCT AAG TTG GTT ACT GAC TTG ACT AAG 720
Lys Ala Glu Phe Ala Glu Val Ser Lys Leu Val Thr Asp Leu Thr Lys 
225 230 235 240 

GTT CAC ACT GAA TGT TGT CAC GGT GAC TTG TTG GAA TGT GCT GAC GAC 768 
Val His Thr Glu Cys Cys His Gly Asp Leu Leu Glu Cys Ala Asp Asp 
245 250 255 

AGA GCT GAC TTG GCT AAG TAT ATC TGT GAA AAC CAA GAC TCT ATC TCT 816 
Arg Ala Asp Leu Ala Lys Tyr Ile Cys Glu Asn Gln Asp Ser Ile Ser
260 265 270 

TCT AAG TTG AAG GAA TGT TGT GAA AAG CCA TTG TTG GAA AAG TCT CAC 864 
Ser Lys Leu Lys Glu Cys Cys Glu Lys Pro Leu Leu Glu Lys Ser His 
275 280 285 

TGT ATC GCT GAA GTT GAA AAC GAC GAA ATG CCA GCT GAC TTG CCA TCT 912
Cys Ile Ala Glu Val Glu Asn Asp Glu Met Pro Ala Asp Leu Pro Ser

290 295 300 
TTG GCT GCT GAC TTC GTT GAA TCT AAG GAC GTT TGT AAG AAC TAC GCT 960
Leu Ala Ala Asp Phe Val Glu Ser Lys Asp Val Cys Lys Asn Tyr Ala 
305 310 315 320 

GAA GCT AAG GAC GTT TTC TTG GGT ATG TTC TTG TAC GAA TAC GCT AGA 1008
Glu Ala Lys Asp Val Phe Leu Gly Met Phe Leu Tyr Glu Tyr Ala Arg
325 330 335 

AGA CAC CCA GAC TAC TCC GTT GTT TTG TTG TTG AGA TTC GCT AAG ACT 1056
Arg His Pro Asp Tyr Ser Val Val Leu Leu Leu Arg Leu Ala Lys Thr 
340 345 350 

TAC GAA Aa ACT TTG GAA AAG TGT TGT GCT GCT GCT GAC CCA CAC GAA 1104 
Tyr Glu Thr Thr Leu Glu Lys Cys Cys Ala Ala Ala Asp Pro His Glu 
355 360 365 

TGT TAC GCT AAG GTT TTC GAC GAA TTT AAG CCA TTG GTT GAA GAA CCA 1152
Cys Tyr Ala Lys Val Phe Asp Glu Phe Lys Pro Leu Val Glu Glu Pro 
370 375 380 

CAA AAC TTG ATT AAG GAA AAC TGT GAA TTG TTC AAG CAA TTG GGT GAA 1200 
Gln Asn Leu Ile Lys Gln Asn Cys Glu Leu Phe Lys Gln Leu Gly Glu
385 390 395 400

TAC AAG TTC CAA AAC GCT TTG TTG GTT AGA TAC ACT AAG AAG GTT CCA 1248 
Tyr Lys Phe Gln Asn Ala Leu Leu Val Arg Tyr Thr Lys Lys Val Pro
405 410 415 

CAA GTC TCC AQ CCA ACT TTG GTT GAA GTC TCT AGA AAC TTG GGT AAG 1296 
Gln Val Ser Thr Pro Thr Leu Val Glu Val Ser Arg Asn Leu Gly Lys
420 425 430 

GTT GGT TCT AAG TGT TGT AAG CAC CCA GAA GCT AAG AGA ATG CCA TGT 1344
Val Gly Ser Lys Cys Cys Lys His Pro Glu Ala Lys Arg Met Pro Cys 
435 440 445 

GCT GAA GAC TAC TTG TCT GTT GTT TTG AAC CAA TTA TGT GTT TTG CAC 1392
Ala Glu Asp Tyr Leu Ser Val Val Leu Asn Gln Leu Cys Val Leu His
450 455 460 

GAA AAG ACT CCA GTT TCT GAC AGA GTT ACT AAG TGT TGT ACT GAA TCT 1440
Glu Lys Thr Pro Val Ser Asp Arg Val Thr Lys Cys Cys Thr Glu Ser 
465 470 475 480 

TTG GTT AAC AGA AGA CCA TGT TTC TCT GCC TTG GAA GTT GAC GAA ACT 1488 
Leu Val Asn Arg Arg Pro Cys Phe Ser Ala Leu Glu Val Asp Glu Thr 
485 490 495 

TAC GTC CCA AAG GAA TTT AAC GCT GAA ACT TTC ACT TTC CAC GCC GAC 1536 
Tyr Val Pro Lys Glu Phe Asn Ala Glu Thr Phe Thr Phe His Ala Asp 

500 505 510 
ATC TGT ACC TTG TCC GAA AAG GAA AGA CAA ATC AAG AAG CAA ACT GCT 1584
Ile Cys Thr Leu Ser Glu Lys Glu Arg Gln Ile Lys Lys Gln Thr Ala
515 520 525

TTG GTT GAA TTG GTT AAG CAC AAG CCA AAG GCT ACT AAG GAA CAA TTG 1632
Leu Val Glu Leu Val Lys His Lys Pro Lys Ala Thr Lys Glu Gln Leu
530 535 540

AAG CCT GTT ATG GAC GAC TTC GCT GCT TTC GTT GAA AAG TGT TGT AAG 1680
Lys Ala Val Met Asp Asp Phe Ala Ala Phe Val Glu Lys Cys Cys Lys
545 550 555 560

GCT GAC GAC AAG GAA ACT TGT TTC GCT GAA GAA GGT AAG AAG TTG GTT 1728
Ala Asp Asp Lys Glu Thr Cys Phe Ala Glu Glu Gly Lys Lys Leu Val
565 570 575 

GCT GCT TCT CAA GCT GCT TTG GGT TTG TAA 1758 
Ala Ala Ser Gln Ala Ala Leu Gly Leu .

580 585 

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 255 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..255
(D) OTHER INFORMATION:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

ATG AGA TTT CCT TCA ATT TTT ACT CCA GTT TTA TTC GCA GCA TCC TCC 48
Met Arg Phe Pro Ser Ile Phe Thr Ala Val Leu Phe Ala Ala Ser Ser
1 5 10 15

GCA TTA GCT GCT CCA CTC AAC ACT ACA ACA GAA GAT GAA ACG GCA CAA 96
Ala Leu Ala Ala Pro Val Asn Thr Thr Thr Glu Asp Glu Thr Ala Gln
20 25 30

ATT CCC GCT GAA GCT GTC ATC GGT TAC TCA GAT TTA GAA GGG GAT TTC 144
Ile Pro Ala Glu Ala Val Ile Gly Tyr Ser Asp Leu Glu Gly Asp Phe
35 40 45

GAT GTT GCT GTT TTG CCA TTT TCC AAC AGC ACA AAT AAC GGG TTA TTG 192
Asp Val Ala Val Leu Pro Phe Ser Asn Ser Thr Asn Asn Gly Leu Leu
50 55 60

TTT ATA AAT ACT ACT ATT GCC AGC ATT GCT GCT AAA GAA GAA GGG GTA 240
Phe Ile Asn Thr Thr Ile Ala Ser Ile Ala Ala Lys Glu Glu Gly Val
65 70 75 80

TCT TTG GAT AAA AGA 255
Ser Leu Asp Lys Arg
85

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 72 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..72
(D) OTHER INFORMATION:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

ATG AAG TGG GTT ACT TTC ATT TCT TTG TTG TTC TTG TTC TCT TCT GCT 48
Met Lys Trp Val Thr Phe Ile Ser Leu Leu Phe Leu Phe Ser Ser Ala
1 5 10 15

TAC TCT AGA GGT GTT TTC AGA AGA 72
Tyr Ser Arg Gly Val Phe Arg Arg
20
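
The correspondence between the nucleotide and amino acid lines of SEQ ID NO:3 can be checked by translation with the standard genetic code, as in the Python sketch below. The 72-nucleotide string used here is a reading of SEQ ID NO:3 that agrees with the overlapping oligonucleotides of SEQ ID NO:10 and SEQ ID NO:12; it should be treated as an assumption where the printed listing is unclear. The function and variable names are illustrative.

```python
BASES = "TCAG"
# Standard genetic code in TCAG order; '*' marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Assumed reading of SEQ ID NO:3 (see the note above).
SEQ_ID_3 = ("ATG AAG TGG GTT ACT TTC ATT TCT TTG TTG TTC TTG TTC TCT TCT GCT "
            "TAC TCT AGA GGT GTT TTC AGA AGA")


def translate(dna):
    """Translate a DNA string (whitespace ignored) into one-letter amino acids."""
    dna = "".join(dna.split()).upper()
    if len(dna) % 3:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


if __name__ == "__main__":
    print(translate(SEQ_ID_3))
```

The same function could be applied to the other coding sequences of the listing to flag codons that disagree with the printed amino acid line.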

(2) IHFOftMATION FOR SEQ ID N0:4: 

(i) SEQUENCE OMRACTEftlSTlCS: 
(A) LENGTH; 43 bue pairs 
(B> TYPE: nucltic »cfd 
(C> SntANDBtMESS: sir«le 
(D> TOPOLOGY: LinMr 

(it) MOLECULE TYPE: DNA (genowic) 



(Xi) SEQUENCE DESCRIPTION: SEQ 10 N0:4: 
TGCTTTCGGT TTGTAAfiAAT TCGGATCCCG TAATCATGGT CAT 43 
(2) INFOftHATION FOR SEQ ID N0:5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 bM* pairs 

(B) TYPE: nucleic acid 

(C) STRANDB>NESS: single 

(D) TOPOLOGY: linear 

Cil) MOLECULE TYPE: DNA Coenowic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:S 
nCTAAGAAT TCGGATCCCG TAAT 
(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 36 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
TGAAAGAGCC TTCAAAGCTT GGGCTGTTGC TAGATT 36

(2) INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:
CTTCAAACCT TGGCCTG 17

(2) INFORMATION FOR SEQ ID NO:8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 38 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:
GTATCTTTGC ATAAAACAGA CCCTCACAAC TCTCAAGT 38
(2) INFORMATION FOR SEQ ID NO:9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:
GATAAAAGAG ACGCTCAC 18
(2) INFORMATION FOR SEQ ID NO:10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 69 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:
GCTTGCATGC CTGCAGAATT CATGAAGTGG GTTACTTTCA TTTCTTTGTT GTTCGACGCT 60
CACAAGTCT 69
(2) INFORMATION FOR SEQ ID NO:11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:
ACTTTCATTT CTTTGTTG 18
(2) INFORMATION FOR SEQ ID NO:12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 69 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:
ATTTCTTTGT TGTTCTTGTT CTCTTCTGCT TACTCTAGAG GTGTTTTCAG AAGAGACCCT 60
CACAAGTCT 69
(2) INFORMATION FOR SEQ ID NO:13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:
TCTGCTTACT CTAGAGGT 18

(2) INFORMATION FOR SEQ ID NO:14:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:
AATTCGATGA GATTTCCTTC AATTTTTACT GCA 33

(2) INFORMATION FOR SEQ ID NO:15:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:
GTAAAAATTG AAGGAAATCT CATCG 25
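
SEQ ID NO:14 and SEQ ID NO:15 appear to be the two strands of a single double-stranded adapter. A minimal Python sketch, assuming the damaged second block of SEQ ID NO:14 reads GATTTCCTTC (which restores the stated 33-base length), checks that the reverse complement of SEQ ID NO:15 occurs within SEQ ID NO:14 and reports the unpaired bases left at each end of the longer strand. The helper names `reverse_complement` and `overhangs` are illustrative and are not taken from the specification.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Sequences as read in this sketch (assumption: the garbled block of
# SEQ ID NO:14 is GATTTCCTTC). Spaces are only for readability.
SEQ_ID_NO_14 = "AATTCGATGA GATTTCCTTC AATTTTTACT GCA".replace(" ", "")
SEQ_ID_NO_15 = "GTAAAAATTG AAGGAAATCT CATCG".replace(" ", "")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string written 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def overhangs(top: str, bottom: str) -> tuple[str, str]:
    """Return the unpaired 5' and 3' ends of `top` when `bottom` anneals to it."""
    paired = reverse_complement(bottom)
    start = top.find(paired)
    if start < 0:
        raise ValueError("strands are not complementary")
    return top[:start], top[start + len(paired):]


if __name__ == "__main__":
    five_prime, three_prime = overhangs(SEQ_ID_NO_14, SEQ_ID_NO_15)
    print(five_prime, three_prime)  # prints "AATT TGCA"
```

The four-base ends found this way, AATT at the 5' end and TGCA at the 3' end, are the same as the single-stranded overhangs left by EcoRI and PstI respectively. That reading is an inference from the sequences alone, not a statement made in this part of the listing.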
(2) INFORMATION FOR SEQ ID NO:16:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:
GAATTAATTC 10

(2) INFORMATION FOR SEQ ID NO:17:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 14 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:
CAGCAGATCT GCTG 14

(2) INFORMATION FOR SEQ ID NO:18:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 36 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:
GACGTTCGTT TGTGCCGATC CAATGCGGTA GTTTAT 36

(2) INFORMATION FOR SEQ ID NO:19:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:
GGCCTCTTGC GGGATCTCCA TTCCGACAGC 30

(2) INFORMATION FOR SEQ ID NO:20:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:
TTGCGGGATG TCCATTCC 18

(2) INFORMATION FOR SEQ ID NO:21:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 37 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:
TTTGTGCAAG CTTATGGATC CGCTTTAATG CGGTAGT 37

(2) INFORMATION FOR SEQ ID NO:22:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:
TTATGGATCC GCTTT 15

(2) INFORMATION FOR SEQ ID NO:23:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 38 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:
TTCGAGCTCC BTACCTAAGG ATCCTGAGAT AAATTTCA 38

(2) INFORMATION FOR SEQ ID NO:24:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24:
TACCTAAGGA TCCTGAG 17
(2) INFORMATION FOR SEQ ID NO:25:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 37 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25:
ACGTTGTCAC TGAAGGCGGC CGCAGTATCT ACAAACC 37

(2) INFORMATION FOR SEQ ID NO:26:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 16 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:
TGAAGGCGGC CGCACT 16

THAT WHICH IS CLAIMED IS:

1. A DNA fragment comprising an expression
cassette, wherein said expression cassette comprises, in
the direction of transcription, the following DNA
sequences:

(i) a promoter region of a methanol responsive
gene of a methylotrophic yeast,

(ii) a DNA sequence encoding a polypeptide
consisting of:

(a) a secretion signal sequence selected
from:

(1) the S. cerevisiae AMF pre-pro
sequence, including the processing site: lys-arg, or

(2) the native HSA signal sequence

and

(b) an HSA peptide; and

(iii) a transcription terminator functional in
a methylotrophic yeast, wherein said DNA sequences are
operationally associated with one another for
transcription of the sequences encoding said polypeptide.

2. A DNA fragment according to Claim 1 further 
comprising at least one selectable marker gene and a 
bacterial origin of replication. 

3. A DNA fragment according to Claim 2 wherein 
said fragment is contained within a circular plasmid. 

4. A DNA fragment according to Claim 1 wherein
said sequence encoding an HSA peptide encodes the natural
585 amino acid form of HSA.

5. A DNA fragment according to Claim 1 wherein 
said methylotrophic yeast is a strain of Pichia pastoris. 

6. A DNA fragment according to Claim 5 wherein
said methanol responsive gene of a methylotrophic yeast
and the transcription terminator are both derived from
the P. pastoris AOX1 gene.

7. A DNA fragment according to Claim 6 further
comprising 3'- and 5'-ends having sufficient homology
with a target gene of a yeast host for said DNA fragment
to effect site directed integration of said fragment into
said target gene.

8. A DNA fragment according to Claim 1 further
comprising 3'- and 5'-ends having sufficient homology
with a target gene of a yeast host for said DNA fragment
to effect site directed integration of said fragment into
said target gene.

9. A DNA fragment according to Claim 1 containing 
multiple copies of said expression cassette. 

10. A DNA fragment according to Claim 7 containing 
multiple copies of said expression cassette. 

11. A DNA fragment according to Claim 9 wherein
said multiple copies of said expression cassette are
oriented in head-to-tail orientation.

12. A DNA fragment according to Claim 7, which is
derived from a SalI digest of the Pichia expression
vector pHSA111 or pHSA211.



13. A DNA fragment according to Claim 7, which is 
the Pichia expression vector pHSA211. 



WO 92/13951    PCT/US92/01015

14. A DNA fragment according to Claim 10, which is
derived from a SalI digest of the Pichia expression
vector pHSA212, pHSA214 or pHSA216.

15. A DNA fragment according to Claim 10, which is
the Pichia expression vector pHSA212, pHSA214 or pHSA216.

16. A methylotrophic yeast cell transformed with 
the DNA fragment of Claim 1. 

17. A methylotrophic yeast cell according to Claim
16 wherein said yeast is a strain of Pichia pastoris.

18. A methylotrophic yeast cell transformed with
the DNA fragment of Claim 4.

19. A methylotrophic yeast cell according to Claim 
18 wherein said yeast is a strain of Pichia pastoris. 

20. A P. pastoris cell transformed with the DNA
fragment of Claim 5.

21. A P. pastoris cell transformed with the DNA
fragment of Claim 6.

22. A P. pastoris cell transformed with the DNA
fragment of Claim 7.

23. A methylotrophic yeast cell transformed with
the DNA fragment of Claim 8.

24. A methylotrophic yeast cell transformed with 
the DNA fragment of Claim 9. 

25. A methylotrophic yeast according to Claim 24
wherein said yeast is a strain of Pichia pastoris.

26. A methylotrophic yeast cell transformed with
the DNA fragment of Claim 11.

27. A P. pastoris cell transformed with the DNA
fragment of Claim 12.

28. A P. pastoris cell according to Claim 27 
wherein said cell is selected from strain G+HSA211S4 or 
G+HSA211S6. 

29. A P. pastoris cell transformed with the DNA
fragment of Claim 14.

30. A P. pastoris cell according to Claim 29
wherein said cell is selected from strain G+HSA212S31,
G+HSA212S32, G+HSA214S34, G+HSA214S40, G+HSA214S51,
G+HSA216S44, G+HSA216S47 or G+HSA216S56.

31. A culture of viable P. pastoris cells according
to Claim 17.

32. A culture of viable P. pastoris cells according 
to Claim 28. 

33. A culture of viable P. pastoris cells according
to Claim 30.

34. A process for producing HSA, said process
comprising growing the cells of Claim 16 under conditions
allowing the expression of said expression cassette(s) in
said cells, and the secretion of said HSA product into
the culture medium.

35. A process according to Claim 34 wherein said
methylotrophic yeast is a strain of Pichia pastoris.

36. A process according to Claim 34 wherein said
cells are grown in a medium containing methanol as a
carbon source.

37. A process according to Claim 34 wherein said
cells have the Mut- phenotype.

38. A process according to Claim 34 wherein said
cells have the Mut+ phenotype.

39. A process according to Claim 38 wherein said 
cells are selected from strain G+HSA211S4, G+HSA211S6, 
G+HSA212S31, G+HSA212S32, G+HSA214S34, G+HSA214S40, 
G+HSA214S51, G+HSA216S44, G+HSA216S47 or G+HSA216S56. 

40. As a composition of matter, the expression
vector pA0856.